[PD] Fastest way to find lines in text file

Jack jack at rybn.org
Wed Mar 22 14:34:40 CET 2017


I thought my two previous mails were clear enough.
But I will answer each point:

1) My previous mails:
I need to find every line of a text file that contains a given word.
The text file has 2,539,592 lines.
Right now I am using [msgfile] from zexy, because I can find a line, skip a
line and find again ... until the end of the text file.
But I am wondering whether there is another object (in another library)
that is faster and specialized for this job?
...
The text file has only two "strings" per line.
Here are 20 lines from the text file:

345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328

2) See above
3) See above
4) See above
5) Linux/Ubuntu 16.10/Pd 0.47.1
6) You're pushing it :)
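
For comparison outside Pd, the lookup described in point 1 can be sketched in Python. This is only a rough illustration, not a Pd solution: it assumes every line holds whitespace-separated fields and that the "word" to find is the first field. The function name `build_index` and the shortened sample are made up for the example. The idea is to read the file once, remember which line numbers belong to each first-column value, and then answer every later query from that table instead of scanning again.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_index(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Map each first-column value to the 0-based line numbers where it occurs."""
    index: Dict[str, List[int]] = defaultdict(list)
    for line_number, line in enumerate(lines):
        fields = line.split()
        if fields:  # ignore blank lines
            index[fields[0]].append(line_number)
    return dict(index)


# A shortened excerpt of the sample lines quoted above.
sample = [
    "345594 577427",
    "345594 567267",
    "345595 374384",
    "345596 447676",
    "345598 579883",
    "345598 379495",
]

index = build_index(sample)
print(index.get("345598", []))  # line numbers of every match, empty list if none
```

For a real file, an open file object can be passed directly, since iterating over it yields one line at a time. Building the table costs one full pass, so this only pays off when the same file is searched more than once.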

++

Jack




On 22/03/2017 at 13:31, Lorenzo Sutton wrote:
> Hi,
> 
> On 22/03/2017 13:01, Jack wrote:
>> I need to find all instances that match the first row.
>> It is not possible with [text search], if I am right.
> 
> I think you should outline your use case/problem in more detail. This
> should be a good practice when asking for support on the Mailing List.
> 
> Example:
> 
> 1) I have a text file where each line contains two integers separated
> by a space (" ") char - such as (possibly paste a part of the file on
> pastebin or similar too).
> 213214 12313
> 123223 13213
> 
> 2) My file is [always/at least/circa/ ...] 2,539,592 lines long
> 
> 3) My algorithm should find all subsequent lines matching the first line
> in the file and return [all line numbers for matches / the total count
> of matched lines / ...]
> 
> 3) I want the algorithm to be [as fast as possible / run in under 1
> second / run in under 1ms / ... ]
> 
> 4) I [want to / do not need to] use Pd Vanilla
> 
> 5) My patch should run on [All platforms / Windows / OSX / Linux / ...]
> 
> 6) My patch should run [on potentially any machine / on a Raspberry Pi /
> on a 1990s 386 machine / on my digital toaster where I have compiled a
> custom version of Pd / ... ]
> 
> :)
> 
> 
>> ++
>>
>> Jack
>>
>>
>>
>> On 22/03/2017 at 08:27, Liam Goodacre wrote:
>>> You can also use [text search], although it's not so easy to find more
>>> than the first instance. If you don't mind taking an extra step, you
>>> could give each line a third term, which is the line number. Then you
>>> can use the "> 3" argument for [text search] to find matches s
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Pd-list <pd-list-bounces at lists.iem.at> on behalf of Jack
>>> <jack at rybn.org>
>>> *Sent:* 21 March 2017 18:14
>>> *To:* pd-list at lists.iem.at
>>> *Subject:* [PD] Fastest way to find lines in text file
>>>
>>> Hello,
>>>
>>> I need to find every line of a text file that contains a given word.
>>> The text file has 2,539,592 lines.
>>> Right now I am using [msgfile] from zexy, because I can find a line, skip a
>>> line and find again ... until the end of the text file.
>>> But I am wondering whether there is another object (in another library)
>>> that is faster and specialized for this job?
>>> Thanx.
>>> ++
>>>
>>> Jack
>>>
>>>
>>> _______________________________________________
>>> Pd-list at lists.iem.at mailing list
>>> UNSUBSCRIBE and account-management ->
>>> https://lists.puredata.info/listinfo/pd-list
>>>
>>>
>>>
>>
>>
>>
> 



