[OT:] Wilbur indexer ? JT?

Active8

Guest
Someone posted something about Wilbur. I was wondering if anyone
using it has managed to get it to work properly with pdftotext.exe.
I've been talking to the author and he claims it works on many
machines. For me, pdftotext works from the command line, but not
from Wilbur.
--
Best Regards,
Mike
 
On Thu, 24 Mar 2005 11:01:42 -0500, Active8 <reply2group@ndbbm.net>
wrote:

Someone posted something about Wilbur. I was wondering if anyone
using it has managed to get it to work properly with pdftotext.exe.
I've been talking to the author and he claims it works on many
machines. For me, pdftotext works from the command line, but not
from Wilbur.
I haven't tried it yet, but that's next.

FLpro doesn't quite hack what I need to do, although it does have
extensive regular expression capability.

...Jim Thompson
--
| James E.Thompson, P.E. | mens |
| Analog Innovations, Inc. | et |
| Analog/Mixed-Signal ASIC's and Discrete Systems | manus |
| Phoenix, Arizona Voice:(480)460-2350 | |
| E-mail Address at Website Fax:(480)460-2142 | Brass Rat |
| http://www.analog-innovations.com | 1962 |

I love to cook with wine. Sometimes I even put it in the food.
 
On Thu, 24 Mar 2005 09:45:40 -0700, Jim Thompson wrote:

On Thu, 24 Mar 2005 11:01:42 -0500, Active8 <reply2group@ndbbm.net>
wrote:

Someone posted something about Wilbur. I was wondering if anyone
using it has managed to get it to work properly with pdftotext.exe.
I've been talking to the author and he claims it works on many
machines. For me, pdftotext works from the command line, but not
from Wilbur.

I haven't tried it yet, but that's next.

FLpro doesn't quite hack what I need to do, although it does have
extensive regular expression capability.

FLpro? No returns but FilePro... maybe it was FileMaker - that POS I
converted for a local plastics extrusion company. flpro.com is about
AutoTrackXP. So whatchoo talking about?

Wilbur is pretty good so far, but the author got sucked into using
MFC for his winders ports back when he did it. The open source will
be a bitch if I have to fix or enhance it myself :(

You'll need to download pdftotext.exe and drop it somewhere in your
path to test it - like windows\system32, or whatever. pdftotext does
a good conversion from the command line. The problem is that when I
let Wilbur do it, he doesn't return any pdfs when he builds the
index. Take pdftotext out and Wilbur returns pdfs, but in the
contents pane, you see the search term along with all the pdf tags.
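
One way to narrow a problem like this down is to repeat by hand what an indexer has to do: run the converter, then look for the search term in the text it produces. A minimal sketch, assuming a pdftotext build that writes to stdout when the output name is `-`. `PDFTOTEXT` and `pdf_has_word` are made-up names for this example, not part of pdftotext or Wilbur.

```shell
#!/usr/bin/env bash
# Sketch: check by hand what a search would see after PDF-to-text conversion.
# PDFTOTEXT and pdf_has_word are hypothetical names used only in this example.
PDFTOTEXT=${PDFTOTEXT:-pdftotext}

pdf_has_word() {
    local pdf=$1 word=$2
    # First make sure the converter can be found on the PATH at all.
    if ! command -v "$PDFTOTEXT" >/dev/null 2>&1; then
        echo "converter not found on PATH: $PDFTOTEXT" >&2
        return 2
    fi
    # "-" as the output name asks pdftotext to write the text to stdout.
    # grep: -q quiet, -i ignore case, -F literal string, -e pattern follows.
    "$PDFTOTEXT" "$pdf" - | grep -qiF -e "$word"
}
```

Called as `pdf_has_word some.pdf WORD`, it returns 0 when the converted text contains the word, 1 when it does not, and 2 when the converter is not on the path. If this works but the indexer still comes back empty, the converter itself is probably not the culprit.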

regex? Sorry. Wildcards and booleans only. Plus a "near" operator
"<". Wilbur doesn't save word spacings.

Speaking of Adobe, CoolEdit must have been *too* cool because Adobe
bought it.
--
Best Regards,
Mike
 
On Thu, 24 Mar 2005 11:35:05 -0700, Jim Thompson wrote:

On Thu, 24 Mar 2005 13:18:09 -0500, Active8 <reply2group@ndbbm.net>
wrote:

On Thu, 24 Mar 2005 09:45:40 -0700, Jim Thompson wrote:

On Thu, 24 Mar 2005 11:01:42 -0500, Active8 <reply2group@ndbbm.net>
wrote:

Someone posted something about Wilbur. I was wondering if anyone
using it has managed to get it to work properly with pdftotext.exe.
I've been talking to the author and he claims it works on many
machines. For me, pdftotext works from the command line, but not
from Wilbur.

I haven't tried it yet, but that's next.

FLpro doesn't quite hack what I need to do, although it does have
extensive regular expression capability.

FLpro? No returns but FilePro... maybe it was FileMaker - that POS I
converted for a local plastics extrusion company. flpro.com is about
AutoTrackXP. So whatchoo talking about?

Wilbur is pretty good so far, but the author got sucked into using
MFC for his winders ports back when he did it. The open source will
be a bitch if I have to fix or enhance it myself :(

You'll need to download pdftotext.exe and drop it somewhere in your
path to test it - like windows\system32, or whatever. pdftotext does
a good conversion from the command line. The problem is that when I
let Wilbur do it, he doesn't return any pdfs when he builds the
index. Take pdftotext out and Wilbur returns pdfs, but in the
contents pane, you see the search term along with all the pdf tags.

regex? Sorry. Wildcards and booleans only. Plus a "near" operator
"<". Wilbur doesn't save word spacings.

Speaking of Adobe, CoolEdit must have been *too* cool because Adobe
bought it.

FLpro = File Locator Pro, but it won't do AND.

I need to find files containing two or more words (or phrases), not
necessarily adjacent, in the file.
I'm guessing you don't have Linux? If you had, you could do it with
a little clever "scripting" - search a directory-ful of files for one
word, capture filenames in the output, pipe that output to grep[0] again,
on the other word.
[0] "global regular expression print"

lessee ...
$ for F in `grep WORD1 * | cut -f1 -d ':' | uniq` ; do \
if [[ `grep WORD2 $F` ]] ; then echo $F ; fi ; done

And you can even make it pretty:
$ for F in `grep WORD1 * | cut -f1 -d ':' | uniq` ; do \
if [[ `grep WORD2 $F` ]] ; then \
echo $F ; \
fi ; \
done
#;-)
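
For what it's worth, the same idea can be written so that it copes with file names containing spaces and with any number of words or phrases. A rough sketch only: `files_with_all` is a made-up helper name, and the case-insensitive literal match (`grep -i -F`) is an assumption about what is wanted, not something stated in the request.

```shell
#!/usr/bin/env bash
# Sketch: list the files in a directory that contain ALL of the given words.
# files_with_all is a hypothetical helper name, not part of grep.
files_with_all() {
    local dir=$1
    shift
    local f word ok
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and the like
        ok=1
        for word in "$@"; do
            # -q quiet, -i ignore case, -F literal string, -e pattern follows
            if ! grep -qiF -e "$word" -- "$f"; then
                ok=0                      # one word missing: reject this file
                break
            fi
        done
        if [ "$ok" -eq 1 ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Something like `files_with_all . WORD1 "some phrase"` prints only the files that contain both, wherever in the file they occur. A phrase has to sit on a single line to match, since grep works line by line.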

Cheers!
Rich
 
On Thu, 24 Mar 2005 11:35:05 -0700, Jim Thompson wrote:

<snip>
FLpro = File Locator Pro, but it won't do AND.

I need to find files containing two or more words (or phrases), not
necessarily adjacent, in the file.
Then Wilbur will work. Normally, if you had a prob with a prog, I'd
check versions. Craig finally showed me his usage output from
pdftotext - the -eol switch didn't work for me and he has it
hardwired in to Wilbur. You'll need version 1.0 or later and the
site I got mine from still has an old version.
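
A quick way to tell whether a given pdftotext build knows the -eol switch is to look for it in the usage output. A small sketch, assuming the build prints its usage text when run with -h (possibly on stderr, hence the redirect). `usage_lists_eol` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: decide from usage text whether the -eol option is offered.
# usage_lists_eol is a hypothetical helper; it reads usage text on stdin and
# succeeds if some line starts (after optional blanks) with the option "-eol".
usage_lists_eol() {
    grep -qE '^[[:space:]]*-eol([[:space:]]|$)'
}

# Typical call, assuming pdftotext prints its usage when given -h:
#   pdftotext -h 2>&1 | usage_lists_eol && echo "-eol is supported"
```

If the check fails, the copy on the path is likely too old for a caller that always passes -eol.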

There are newer versions, but wilbur.redtree.com/pdf (it's not in
the help file) links to stuff that works with Wilbur.

It rocks now. pdftotext just mangles equations and maybe a bit of
text.
--
Best Regards,
Mike
 
On Thu, 24 Mar 2005 19:34:07 GMT, Rich Grise wrote:
<snip>
I'm guessing you don't have Linux? If you had, you could do it with
a little clever "scripting" - search a directory-ful of files for one
word, capture filenames in the output, pipe that output to grep[0] again,
on the other word.
[0] "global regular expression print"
One of my favorite acronyms :)
lessee ...
$ for F in `grep WORD1 * | cut -f1 -d ':' | uniq` ; do \
if [[ `grep WORD2 $F` ]] ; then echo $F ; fi ; done

And you can even make it pretty:
$ for F in `grep WORD1 * | cut -f1 -d ':' | uniq` ; do \
if [[ `grep WORD2 $F` ]] ; then \
echo $F ; \
fi ; \
done
#;-)

Hey. Thanks for the headstart. I need to wait 'til I can get a new
Mandrake from linuxiso.org (no high speed yet and my friend has
disappeared from the net). Then I'll do stuff like that. My old
redhat had some kind of indexing set up in a cron job, but I think
that was just for speeding up the locate, apropos, and find commands -
can't remember.
--
Best Regards,
Mike
 
On Thu, 24 Mar 2005 15:33:08 -0500, Active8 wrote:

On Thu, 24 Mar 2005 19:34:07 GMT, Rich Grise wrote:
snip

[0] "global regular expression print"

One of my favorite acronyms :)
Un*x is full of delightful and whimsical command names. One that
tickles me is "bison". Back in the dark days (when I was running
Coherent), we had yacc, meaning "yet another compiler compiler". Later, a
similar, improved utility emerged, called "bison". Why? Because a bison is
quite like a yak, but different.

--
"Electricity is of two kinds, positive and negative. The difference
is, I presume, that one comes a little more expensive, but is more
durable; the other is a cheaper thing, but the moths get into it."
(Stephen Leacock)
 
On Thu, 24 Mar 2005 22:09:28 -0800, JeffM wrote:

I need to find files containing two or more words (or phrases), not
necessarily adjacent, in the file.
Jim Thompson

I'm guessing you don't have Linux?
...pipe that output to grep
Rich Grise

I already suggested it (and went one better).
http://groups-beta.google.com/group/sci.electronics.cad/browse_frm/thread/dc0c583da1c41ffd/de25ce96d1b9b1b2?q=grep-for-Windows

I think Jim has me plonked.
It's almost amusing watching Jim cut off his nose to spite his face.

Cheers!
Rich
 
On Fri, 25 Mar 2005 13:10:15 +0000, Fred Abse wrote:

On Thu, 24 Mar 2005 15:33:08 -0500, Active8 wrote:

On Thu, 24 Mar 2005 19:34:07 GMT, Rich Grise wrote:
snip

[0] "global regular expression print"

One of my favorite acronyms :)

Un*x is full of delightful and whimsical command names. One that
tickles me is "bison". Back in the dark days (when I was running
Coherent), we had yacc, meaning "yet another compiler compiler". Later, a
similar, improved utility emerged, called "bison". Why? Because a bison is
quite like a yak, but different.
I do like the ones that actually mean something - what bugs me is
the ones like awk - some kind of programming language, invented by
Aho, Weinberger, and Kernighan. So they named it after themselves.

I especially like Larry Wall's take on perl: Pathologically Eclectic
Rubbish Lister. ;-)
(practical extraction & report language)

Cheers!
Rich
 
On 24 Mar 2005 22:09:28 -0800, JeffM wrote:

I need to find files containing two or more words (or phrases), not
necessarily adjacent, in the file.
Jim Thompson

I'm guessing you don't have Linux?
...pipe that output to grep
Rich Grise

I already suggested it (and went one better).
http://groups-beta.google.com/group/sci.electronics.cad/browse_frm/thread/dc0c583da1c41ffd/de25ce96d1b9b1b2?q=grep-for-Windows

I think Jim has me plonked.
Thanks for bringing it up again. There aren't many windows progs
that I know of that support regex.
--
Best Regards,
Mike
 
On Fri, 25 Mar 2005 18:11:52 +0000, Rich Grise wrote:

I do like the ones that actually mean something - what bugs me is
the ones like awk - some kind of programming language, invented by
Aho, Weinberger, and Kernighan. So they named it after themselves.
Well, at least they didn't call C "K(ernighan) (and) R(itchie's)
A(pplication) P(rogrammer)"

:)

--
"Electricity is of two kinds, positive and negative. The difference
is, I presume, that one comes a little more expensive, but is more
durable; the other is a cheaper thing, but the moths get into it."
(Stephen Leacock)
 
On Sat, 26 Mar 2005 11:57:41 +0000, Fred Abse wrote:

On Fri, 25 Mar 2005 18:11:52 +0000, Rich Grise wrote:

I do like the ones that actually mean something - what bugs me is
the ones like awk - some kind of programming language, invented by
Aho, Weinberger, and Kernighan. So they named it after themselves.

Well, at least they didn't call C "K(ernighan) (and) R(itchie's)
A(pplication) P(rogrammer)"

:)
LOL!
Hey, this one's a keeper! When I repeat this throughout the world,
would you like me to include attribution?

Thanks!
Rich
 
On Thu, 24 Mar 2005 13:18:09 -0500, Active8 <reply2group@ndbbm.net>
wrote:

On Thu, 24 Mar 2005 09:45:40 -0700, Jim Thompson wrote:

On Thu, 24 Mar 2005 11:01:42 -0500, Active8 <reply2group@ndbbm.net>
wrote:

Someone posted something about Wilbur. I was wondering if anyone
using it has managed to get it to work properly with pdftotext.exe.
I've been talking to the author and he claims it works on many
machines. For me, pdftotext works from the command line, but not
from Wilbur.

I haven't tried it yet, but that's next.

FLpro doesn't quite hack what I need to do, although it does have
extensive regular expression capability.

FLpro? No returns but FilePro... maybe it was FileMaker - that POS I
converted for a local plastics extrusion company. flpro.com is about
AutoTrackXP. So whatchoo talking about?

Wilbur is pretty good so far, but the author got sucked into using
MFC for his winders ports back when he did it. The open source will
be a bitch if I have to fix or enhance it myself :(

You'll need to download pdftotext.exe and drop it somewhere in your
path to test it - like windows\system32, or whatever. pdftotext does
a good conversion from the command line. The problem is that when I
let Wilbur do it, he doesn't return any pdfs when he builds the
index. Take pdftotext out and Wilbur returns pdfs, but in the
contents pane, you see the search term along with all the pdf tags.

regex? Sorry. Wildcards and booleans only. Plus a "near" operator
"<". Wilbur doesn't save word spacings.

Speaking of Adobe, CoolEdit must have been *too* cool because Adobe
bought it.
FLpro = File Locator Pro, but it won't do AND.

I need to find files containing two or more words (or phrases), not
necessarily adjacent, in the file.

...Jim Thompson
--
| James E.Thompson, P.E. | mens |
| Analog Innovations, Inc. | et |
| Analog/Mixed-Signal ASIC's and Discrete Systems | manus |
| Phoenix, Arizona Voice:(480)460-2350 | |
| E-mail Address at Website Fax:(480)460-2142 | Brass Rat |
| http://www.analog-innovations.com | 1962 |

I love to cook with wine. Sometimes I even put it in the food.
 
