%!

% PS WEBSITE SEARCH ENGINE QUERY LOG EXTRACTOR GRABSRCH.PS
% ========================================================

% Copyright c 1999 by Don Lancaster and Synergetics, Box 809, Thatcher, AZ, 85552
% (520) 428-4073 don@tinaja.com http://www.tinaja.com
% Consulting services available per http://www.tinaja.com/info01.html

% All commercial rights and all electronic media rights fully reserved.
% Personal use permitted provided header and entire file remains intact.
% Linking welcome. Reposting expressly forbidden.

% version 1.2

% This specialized PostScript-as-language routine reads a specified file line by
% line, tests each line against the key phrases, and saves only the qualifying
% lines to a new file. In this example, log files are searched for the ABSENCE
% of a self-reference and the PRESENCE of a search engine marker, leaving only
% valid query strings from external search engines.

% GRABREFS.PS is a similar program that extracts ONLY the other referral queries.

% More detail in http://www.tinaja.com/glib/muse138.pdf

% To use this program, enter the full path source and target file names
% and the key phrase below and resave. Then send to Acrobat Distiller.
% Note that a NO FILE PRODUCED message is normal and expected.

% IMPORTANT: Be sure to use "\\" when you mean "\" in any PostScript string!
%            Be sure to remove any ending nulls in your referral log file.

/sourcefilename (C:\\medocs\\logfile.txt) def      % the name of the input file
/targetfilename (C:\\medocs\\searchfile.txt) def   % extracted search output file
/searchphrase (tinaja) def                         % exclusion term to filter on

/workstring 20000 string def                       % buffer that holds one log line

% checkline is where a line is tested to see if the referral is from your own
% site. As written, it passes every line straight on to addtooutfile1 and does
% not yet test searchphrase...

/checkline {addtooutfile1} def                     % further tests go here

% addtooutfile1 filters for a valid search engine query...

/addtooutfile1 {(http://) search                   % extract referral string
                  {pop pop dup (?) search          % but only when an actual
                     {addtooutfile} if             % search engine query
                  } if
               } def

% /startoutfile creates an outputfile object...

/ws {writefile exch writestring} def               % disk write utility

/startoutfile {targetfilename (w+) file
               /writefile exch def                 % make a file to write
               (\n\nLines containing a search engine) ws   % label the file and criteria
               ( "?" in document ) ws
               sourcefilename ws
               (:\n\n) ws
              } def

% /addtooutfile adds the current url line to the report file. It expects the
% three strings left by a successful (?) search: the text before the "?" is
% written on one line, then the "?" and the query text on the next...

/addtooutfile {ws (\n) ws ws ws (\n\n) ws} def

% /endoutfile finishes up and closes the output file...

/endoutfile {(\n\n) ws                             % outdent
             writefile closefile
            } def

% this is the main loop. It reads one line of the source log file at a time
% for processing...

/grabphrase {sourcefilename (r) file
             /workfile exch def                    % make a file to read
             startoutfile                          % start output file
             {mark workfile workstring readline    % read one line at a time
                {checkline} {exit} ifelse          % test lines till done
              cleartomark                          % just in case sloppy
             } loop
             endoutfile                            % complete output file
             cleartomark                           % drop the mark and last substring left by exit
            } def

% This actually does it...

grabphrase                                         % extract search engine queries

%% EOF