Looking for a document indexer

Freaker

New member
I'm looking for a program that will go through a directory of MS Word documents and create an index of each word and which documents it occurs in. It has to be compatible with Cyrillic characters.
Any suggestions?
 
I have a directory containing Russian documents, and I want an index file telling me which files each word occurs in.
For example, I want the index to say that да occurs in (Document1, Document3, Document5) and нет occurs in (Document2, Document4).
I don't want to search, I just want a human-readable index file.
 
desktop.google.com
 
> I don't want to search, I just want a human readable index
> file.
>

hehe...

grep (args) > index.txt ?

i have no idea. :p
 
> No idea what you're looking for really. The closest I can
> think of is outdated software called Agent Ransack that
> finds keywords in plain text files (doesn't work with MS
> Word).

It can find raw ASCII text in any file, Word or otherwise, although it doesn't recognize line breaks other than plain ASCII CRs and LFs, or any special formatting that is specific to a particular file format. For that you need the shareware version (FileLocator Pro), which supports interpreters for particular file types such as PDF or Word.
 
> I'm looking for a program that will go through a directory of
> MS Word documents, and then create an index of each word and
> which documents it occurs in, and has to be compatible with
> Cyrillic characters.
> Any suggestions?

Copernic (http://www.copernic.com/)?
 
> I'm looking for a program that will go through a directory of
> MS Word documents, and then create an index of each word and
> which documents it occurs in, and has to be compatible with
> Cyrillic characters.
> Any suggestions?

Don't know what OS you're running, but the Indexing Service in XP seems to do full-text indexing...

If you have it installed, you can add a directory to the catalog for indexing. Once it's indexed, right-click My Computer and select Manage. Go to the Services and Applications section and expand Indexing Service. From there you'll see a System tree item, under which there's a Query the Catalog entry. I think that with the advanced search capabilities you can even use regular expression queries. I assume it would support any character set that Windows does, but I've never tried it with non-Latin characters, so I don't know for certain.
 
Many of the listed programs would be perfect, but I'm doing this for computer-illiterate people, and rather than searching the index I want to print it out.
I'm going to take a stab at making one myself, since I can't find anything that does what I want.
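
For anyone who finds this later, here is roughly the shape I have in mind. It's only a minimal sketch in Python, and it assumes the text has already been pulled out of the Word files into plain Unicode strings (that extraction step is not shown and would need a separate tool). The names build_index and format_index, and the sample documents, are made up for illustration.

```python
import re
from collections import defaultdict

# Letters only, Unicode-aware: in Python 3 this matches Cyrillic as well as
# Latin letters, and skips digits, underscores and punctuation.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_index(texts):
    """Map each lowercased word to the sorted names of documents containing it.

    `texts` is a hypothetical dict of {document name: extracted plain text}.
    """
    index = defaultdict(set)
    for name, text in texts.items():
        for word in WORD_RE.findall(text.lower()):
            index[word].add(name)
    # Words are sorted by code point, which is close to (but not exactly)
    # Russian alphabetical order; for example, "ё" sorts after "я".
    return {word: sorted(names) for word, names in sorted(index.items())}


def format_index(index):
    """Render the index as one printable line per word."""
    return "\n".join(
        f"{word} occurs in ({', '.join(docs)})" for word, docs in index.items()
    )


if __name__ == "__main__":
    # Made-up sample data standing in for text extracted from real documents.
    sample = {
        "Document1": "Да, конечно.",
        "Document2": "Нет!",
        "Document3": "да и нет",
    }
    print(format_index(build_index(sample)))
```

Writing the result of format_index to a file opened with encoding="utf-8" should give something that can be printed as-is. A set per word keeps each document listed once no matter how often the word appears in it.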
 