#!/usr/bin/perl
#

$bapropos = q`
mira(1r) - text-only, command-line oriented web browser
`;

$esc = "\033";
$g_bold = $esc . "[1m";
$g_normal = $esc . "[0m";
$zn = "\005";

$b = $g_bold;
$n = $g_normal;

$cmdhelp = "
MIRA (Munafo's Internet Research Assistant) is a simple web browser
created to satisfy the following objectives (listed by priority):

  - When you quit or restart the browser after a crash, it retains the
    history ('back' command's URL list) you had before the quit/crash

  - Keep a permanent record of where you have been and the text you
    have read while searching the Internet

  - Allow viewing of old web pages (and old versions of current web
    pages) even after the original server no longer has them

  - Allow the user to easily search through all this old saved text for
    one or more words or a phrase, etc.

  - Retain simplicity by remaining entirely text-only (like $b"."lynx$n)
    and rely on external programs (like $b"."xv$n and $b"."arena$n) to
    view graphics and graphics-intensive pages
";

$help = qq`
NAME

  MIRA - text-only, command-line oriented web browser

DESCRIPTION

$cmdhelp

For more perl goodness, go to mrob.com/pub/perl

SEE ALSO

  bget - Download or stream data from a URL to a file or stdout

  strip-wayback - Remove archive.org headers and links from HTML

`;

$unused_header = q`

Revision History:

19980706 Discover nested forms in Hotbot results, and trying to submit
         something from the outer form that occurs after the end of the
         inner form doesn't work. Fix by using shift and unshift to keep
         a stack of form numbers.
19980723 Add 'gsu' command to make global history searches easier!
19980916 Add routines to save and restore the "local history stack".
         (period of many undocumented changes)
19981218 Begin recording dates in global log (g_log) file
19990126 Over the last few days I have added color-coding to show what
         links have been visited, and of those that have not, which are
         on the same host as the current page.
         This aids in the (recently more frequent) practice of manually
         loading everything about a given subject just so I can have the
         text in the archive.
19990127 add support for viewing PDFs through gv, but gv doesn't seem to
         display the PDFs properly.
19990201 #glr# command works. (#ghr# is still pending). Fix bugs in #gsc#
         and add default command to #gsc# making it easy to continue the
         search to older pages.
19990208 Form submissions now include values of checked "radio" items,
         and recognize numerical values of "select" items; this makes
         DejaNews queries work properly.
19990209 #f# command now supports a numeric argument; added #sf# command;
         #bm# now prints the title and URL it's adding
19990211 Now you can type #rf# to view form fields after you've filled
         them in; #m# and #vs# commands include similar changes.
19990212 Now you can type "a .." from a URL that ends in "/" and it will
         go up one directory. This complements the use of "a " from URLs
         that don't end in "/".
19990216 Added #ga# command allowing jump to an anchor based on text in
         the anchor's label. Often useful on pages returned by search
         engines, where anchors like "Next" occur with an unpredictable
         number but a predictable name.
19990219 Hotbot has now set up their links to be queries to HotBot, which
         return a page containing a redirected URL. Modify MIRA to handle
         this by retroactively changing its internal URLs (in anc_base,
         stack, global history and log) and copying the already-loaded
         data into another place in the cache.
19990223 Fix a bug that made DejaNews queries not get cached
19990224 Yet another bug that made redirected pages (like HotBot hits)
         not get cached. This time it's fixed a lot better; I found that
         there was some confusion as to where the extra cache copy should
         be created and decided to put it completely in load_n_cache.
19990301 formatting chain now generates an output file containing plain
         ASCII output; #p# now has an option to save in plain ASCII
         format.
         #p#'s (S) and (A) options append if file already exists.
19990302 #p# command switches color to magenta.
19990303 Fix some bugs in newline formatting to plain ASCII output.
19990304 Add URL and date stamp to beginning of plain ASCII output file.
         #gsc# now takes a quoted argument to allow searching for strings
         containing spaces.
19990308 add_anchor now gleans hrefs from within query hrefs, essentially
         restoring direct links to HotBot query results pages (see
         19990219). This allows the user to see if the link has been
         visited, very useful when a new search returns links visited
         through a previous search. Also, direct links are often faster
         and more reliable. Make AREA anchors have colors just like
         normal anchors.
19990310 AREA labels now show part of their href text to make them less
         ambiguous. Fixed bug with #f# in GHR mode.
19990312 Found a page that ends anchors with instead of .
19990325 add_anchor1 now recognizes "javascript:openWindow" in URLs.