%!

% PS WEB SITE REFERRAL LOG EXTRACTOR GRABREFS.PS
% ==============================================

% Copyright c 1999 by Don Lancaster and Synergetics, Box 809, Thatcher, AZ, 85552
% (520) 428-4073 don@tinaja.com http://www.tinaja.com
% Consulting services available per http://www.tinaja.com/info01.html

% All commercial rights and all electronic media rights fully reserved.
% Personal use permitted provided header and entire file remains intact.
% Linking welcome. Reposting expressly forbidden.

% version 1.2

% This specialized PostScript-as-language routine reads a specified file line by
% line, scans each line for the ABSENCE of a key phrase, and saves only those lines
% NOT containing the phrase to a new file. In this example, log files are searched
% for the absence of a self-reference and the absence of a search engine marker,
% leaving only valid external referrals.

% GRABSRCH.PS is a similar program that extracts ONLY the search engine queries.

% More detail in http://www.tinaja.com/glib/muse138.pdf

% To use this program, enter the full-path source file and target file names
% and the key phrase below, then resave. Then send to Acrobat Distiller.

% Note that a NO FILE PRODUCED message is normal and expected.

% IMPORTANT: Be sure to use "\\" when you mean "\" in a PostScript string!
%            Be sure to remove any ending nulls from your log file.

/sourcefilename (C:\\medocs\\logfile.txt) def    % the name of the input file
/targetfilename (C:\\medocs\\reffile.txt) def    % extracted referral output file
/searchphrase (tinaja) def                       % exclusion term to filter on

/workstring 20000 string def                     % line buffer; longest line allowed

% checkline tests to see if the referral is from your own site...

/checkline {dup searchphrase search      % look for self-referral
            not                          % if missing
               {pop                      % drop the unmatched copy
                addtooutfile1}           % and pass the line on
               {pop pop pop} ifelse      % otherwise discard the search pieces
           } def

% addtooutfile1 filters for a valid referral...

/addtooutfile1 {(http://) search         % extract referral string
                  {pop pop dup (?)
                   search not            % but only if
                     {addtooutfile} if   % no search engines
                  } if
               } def

% /startoutfile creates an outputfile object...

/ws {writefile exch writestring} def     % disk write utility

/startoutfile {targetfilename (w+) file
               /writefile exch def       % make a file to write
               (\n\nReferrals NOT containing ") ws   % label the file and criteria
               searchphrase ws
               (" or "?" in ) ws
               sourcefilename ws
               (:\n\n) ws
              } def

% /addtooutfile adds the current url line to the report file...

/addtooutfile {ws (\n) ws} def

% /endoutfile finishes up and closes the output file...

/endoutfile {(\n\n) ws                   % trailing blank lines
             writefile closefile
            } def

% This is the main loop. It reads one line of the source log file at a time
% for processing...

/grabphrase {sourcefilename (r) file
             /workfile exch def          % make a file to read
             startoutfile                % start output file
             {mark
              workfile workstring readline    % read one line at a time
                {checkline}{exit} ifelse      % test lines till done
              cleartomark                     % just in case sloppy
             } loop
             endoutfile                  % complete output file
             cleartomark                 % drop the mark and partial line left by exit
            } def

% This actually does it...

grabphrase                               % extract referrals

%% EOF