Randal Schwartz Archive
|
Seriously? A Smalltalk web framework? While you might be skeptical at first, this just might be the ultimate developer tinkertoy.
|
|
Randal Schwartz explains why you should consider Contextual::Return for your next tricky return value problem.
|
|
Like Batman, Perl has its own "utility belts", namely Scalar::Util and List::Util. Here's a look at what each gizmo can do.
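A quick tour of a few of those gizmos (the data here is just illustrative):

```perl
use strict;
use warnings;
use List::Util qw(first sum max);
use Scalar::Util qw(looks_like_number blessed);

my @prices = (12, 7, 42, 3);

# first: return the first element matching a condition
my $big = first { $_ > 10 } @prices;            # 12

# sum and max: numeric reductions without explicit loops
my $total   = sum @prices;                      # 64
my $highest = max @prices;                      # 42

# looks_like_number: would Perl treat this string as a number?
print looks_like_number("3.14") ? "yes" : "no", "\n";   # yes

# blessed: class name if the reference is an object, undef otherwise
my $obj = bless {}, 'Gadget';
print blessed($obj), "\n";                      # Gadget
```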
|
|
The Moose object system enforces types, validates values, and coerces parameters to the correct type.
|
|
Build better classes faster with the new Perl package named Moose.
|
|
Create, retrieve, update, and delete records easily
with the Rose::DB::Object object-relational mapper.
|
|
Rose::DB::Object makes typical CRUD a breeze.
|
|
Learn how to abstract database rows using
Rose::DB::Object.
|
|
Watch IRC channels with Perl and the POE module.
|
|
Implement a nifty progress bar with a handful of modules and a smattering of code.
|
|
To many, Smalltalk remains the canonical object-oriented programming language. But Perl can leverage the best practices of Smalltalk and do more. Randal Schwartz reminisces and shares his usual pearls of wisdom.
|
|
Take a look at Perl's confusing, but important pack and unpack functions.
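A taste of what those templates can do (the values and record layout here are made up for illustration):

```perl
use strict;
use warnings;

# Pack a 4-byte big-endian int, a 2-byte big-endian short, and one byte
my $binary = pack "N n C", 70000, 513, 9;
print length($binary), "\n";          # 7 bytes total

# unpack reverses the operation with the same template
my ($long, $short, $byte) = unpack "N n C", $binary;
print "$long $short $byte\n";         # 70000 513 9

# Templates also slice fixed-width text records
my ($name, $age) = unpack "A10 A3", "Fred Flint35 ";
print "$name/$age\n";                 # Fred Flint/35
```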
|
|
The" new" Web is all shiny and collaborative, but" old school" Usenet is still chugging along. Here, Randal Schwartz connects some of the new with some of the old, scraping CPAN for news of novel Perl modules.
|
|
Learn how to combine CGI::Prototype and Apache action scripts to serve custom and template web pages.
|
|
Template Toolkit is great for dynamic sites, but it can also ease the task of keeping a static site up-to-date. Perl guru Randal Schwartz sings TT's virtues while building a site for budding karaoke stars.
|
|
Most developers use the Template Toolkit (TT) to generate dynamic web pages based on input parameters, but TT can help static web sites as well. Let’s take a look at a typical small, static website and see how TT can help.
|
|
A small amount of code produces a logger process to write web server log entries to a database.
|
|
Learn how to build a better web server log using your own mod_perl handler.
|
|
To debug Perl applications — even Web applications — just follow Randal’s three simple rules.
|
|
The Template Toolkit does not support any profiling tools “out of the box.” However, that didn’t stop Randal from getting the numbers — and the performance boost — he needed.
|
|
Learn how to automate a mini-CPAN update with yet another pearl of Perl wisdom.
|
|
Sooner or later, every Perl hacker ends up wanting to process a collection of files contained within a directory, including all the files in all the subdirectories. Thankfully, Perl comes with the File::Find module to perform this task in a tested, portable manner.
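A minimal sketch of that recursive walk, using a throwaway directory tree built just for the example:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small tree to walk (a disposable example directory)
my $dir = tempdir(CLEANUP => 1);
make_path("$dir/sub");
open my $fh, '>', "$dir/sub/note.txt" or die $!;
close $fh;

# find() calls the "wanted" callback once per entry; inside it,
# $_ is the basename and $File::Find::name is the full path
my @found;
find(sub { push @found, $File::Find::name if /\.txt$/ }, $dir);
print "@found\n";
```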
|
|
In the last two columns, I introduced my CGI::Prototype generic controller framework. This time, let’s continue the examination with a description of a real workhorse subclass, CGI::Prototype::Hidden.
|
|
Need to create a CGI application? Save time and lines of code with Randal's new CGI::Prototype.
|
|
Read why Randal invents yet another CGI framework.
|
|
Want to tie a complex data type? Here's how, courtesy of Perl guru Randal Schwartz.
|
|
You may have heard the term "tied variable" before, especially if you've accessed a DBM-based hash, but what is a tied variable and why would you use one? This month and next, let’s take a look.
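The smallest possible demonstration of the idea (the class and its behavior are invented for illustration): every read and write of the variable is routed through methods you supply, which is the same trick a DBM-backed hash uses at a grander scale.

```perl
use strict;
use warnings;

# A tied scalar class: STORE intercepts writes, FETCH intercepts reads
package UpperCase;
sub TIESCALAR { my $v = ''; bless \$v, shift }
sub STORE     { my $self = shift; $$self = uc shift }
sub FETCH     { ${ $_[0] } }

package main;
tie my $shout, 'UpperCase';
$shout = "quietly now";       # goes through STORE
print "$shout\n";             # FETCH runs here: QUIETLY NOW
```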
|
|
As I mentioned last month, having persistent Perl code means that some steps of your application can be reused rather than repeated. One very easy optimization is keeping your database handles open between web hits, rather than reopening them on each new hit. The Apache::DBI module (found in the CPAN) does the work for you by altering the way normal DBI connections are processed.
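The idea Apache::DBI implements, caching connections keyed by their connect arguments so repeated hits reuse one handle, can be sketched in miniature. This toy memoizer stands in for the real module, which hooks DBI->connect itself; the DSN and the handle contents here are illustrative only:

```perl
use strict;
use warnings;

my %cache;
my $opened = 0;

sub get_handle {
    my ($dsn, $user) = @_;
    # Reuse a cached "handle" for the same arguments; "connect" otherwise
    return $cache{"$dsn\0$user"} //= do {
        $opened++;                       # count real "connections"
        { dsn => $dsn, user => $user };  # stand-in for a DBI handle
    };
}

get_handle("dbi:SQLite:web.db", "www") for 1 .. 3;
print "$opened\n";    # 1 -- only the first call "connects"
```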
|
|
Last month, I talked a bit about mod_perl, and how I used it extensively on my web server. But I was reminded by a few of my reviewers that I've yet to provide a good overview of mod_perl in any of my columns! Time to fix that.
|
|
In the previous three articles, I introduced my templating system of choice, the Template Toolkit (TT). Since those articles were intended as overviews, I didn't have much space to go into meaty examples. So, in this article, I'll look at how I'm using TT every day to help me manage the Stonehenge Consulting web site (http://www.stonehenge.com).
|
|
In the previous two columns, I introduced my templating system of choice, the Template Toolkit. Continuing from where I left off, let's look at some of the other features of the Template Toolkit (TT), including how to configure TT and use it from Perl, from the command line, and embedded in Apache.
|
|
In the previous "Perl of Wisdom," I introduced my templating system of choice, the aptly-named Template Toolkit (TT). Continuing from where I left off, let's look at some of TT's other features.
|
|
In some of my past columns, I've mentioned that my template system of choice is the aptly named Template Toolkit, a marvelous work by Andy Wardley. Although I've demonstrated how I've used the Template Toolkit (TT), I haven't really talked enough about what makes it so wonderfully useful. So, this month, let's take a more in-depth look at the wonders of TT.
|
|
In some of my past columns, I've mentioned that my template system of choice is the aptly named Template Toolkit, a marvelous work by Andy Wardley. Although I've demonstrated how I've used the Template Toolkit (TT), I haven't really talked enough about what makes it so wonderfully useful. So, this month, let's take a more in-depth look at the wonders of TT.
|
|
With the recent multiple and varied outbreaks of Windows-based worms generating ever-increasing loads of spam, I've been taxed as the system administrator for the company server to maintain a vigil against the attacks. While the actual worms can't infect my box, the onslaught of worm payloads (and the inevitable increase in spam from infected machines) has threatened an ongoing denial of service attack. At one point recently, I was accepting and attempting to process over 2,000 worm payloads per hour (including generating RFC-mandated bounce messages for those), as well as handling the 2,000 extremely false "you have a virus" messages thoughtfully (not!) generated by the antivirus blockers.
|
|
Although many of my columns deal with entire programs, I find that people still send me email about the basics. So, this month, I thought I'd address an issue that people seem to keep asking about: basic list manipulation.
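The three workhorses of basic list manipulation, on a made-up word list:

```perl
use strict;
use warnings;

my @words = qw(apple Banana cherry avocado);

# grep: keep only the elements that match a condition
my @a_words = grep { /^a/i } @words;            # apple avocado

# map: transform each element into something new
my @lengths = map { length } @words;            # 5 6 6 7

# sort with a custom comparison: case-insensitive ordering
my @sorted = sort { lc($a) cmp lc($b) } @words;
print "@sorted\n";    # apple avocado Banana cherry
```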
|
|
Back in one of the very first issues of linuxdlsazine (September 1999, available online at http://www.linuxdls.com/1999-09/perl_01.html), I wrote about the Spew language. Given a description of text, sentences, and paragraphs, Spew generates random prose based on that description. The grammar is specified using a simple BNF-like format, with extensions to give weighting to more-favored choices.
|
|
Recently, I found myself hacking a web application for a customer. If you've written a web application or two, you know the type: a multi-page web form where the fields need to be validated, stored into session data, and then finally dispatched into the next phase.
|
|
One activity I find myself frequently attempting is extracting bits of useful information from existing web pages that change over some period of time. In an ideal world, everything I'd want would be provided via some RSS feed or "wholesale" SOAP web service, but in the world I still live in, I usually end up parsing the "retail" HTML provided for browser views.
|
|
In last month's column, I showed how to create a web site testing tool based on Perl's own testing framework and the WWW::Mechanize module. For reference, I've reproduced the code developed in last month's article in Listing One. The test code verifies the proper operation of a web site, in this case, http://search.cpan.org.
|
|
If you run an "always on" e-commerce site (perhaps using some of the high-availability tricks described in this issue), you must ensure that search forms really operate and that the pages pointed to have reasonable content. Validation is vital for dynamic web sites, especially those that generate an "everything's OK" 200 status when the content of the page contains a Java traceback from a database connection. To truly have high availability, you have to watch the associated programs and databases -- not just that the links on your pages all go somewhere reasonable.
|
|
When I first started playing with awk more than two decades ago, I was amazed at the ease with which common tasks could be easily solved through the use of its "array" datatype. Prior to that, I had experienced only BASIC and C arrays, where the only index available was a small integer. But awk arrays could have arbitrary strings as keys!
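Perl's hash is the same idea as awk's string-keyed array; a frequency count is the classic one-liner use (the colors are, of course, arbitrary):

```perl
use strict;
use warnings;

# Arbitrary strings as keys, just like awk
my %count;
$count{$_}++ for qw(red green red blue red);

print "$count{red}\n";                # 3

# Walk the keys in sorted order, awk-style
for my $color (sort keys %count) {
    print "$color => $count{$color}\n";
}
```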
|
|
I admit it. Like anyone else with a decent-speed connection to the Internet, I collect a lot of images. For example, a few months ago, I described a program that looks through Yahoo! news images for pictures of Oregon and some of my favorite singing stars. Sometimes, an image travels multiple paths before it ends up on my disk, and thus gets saved under different names. But that's a waste of disk space, so I want to eliminate duplicates where I can.
|
|
More and more these days, you get faced with a problem with angle brackets somewhere in the data. How do you find what you're looking for in HTML or XML data?
|
|
A Perl program alters the outside world in some manner. Otherwise, there'd be no point in running the program. But sometimes, our Perl programs need a little "memory" to do their job, something that persists information from one invocation to the next. But how do you keep such values around?
|
|
Even though the Web is roughly a decade old and there are now many options for developing Web applications, Perl is still regarded by many as "the darling language of Web programming." Perl's text-wrangling abilities still exceed those of any other popular open source language, and a wealth of Perl modules (from the core distribution and the CPAN) makes Web applications a snap to construct and maintain.
|
|
In last month's column, I introduced the File::Find module that's included as part of the core Perl distribution. File::Find provides a framework to recursively catalog or manipulate directories and their contents.
|
|
As I type this month's column, we're just pulling away from Ocho Rios, Jamaica, on the latest Geek Cruise (http://www.geekcruises.com) called "Linux Lunacy 2." Earlier today, some of the speakers on this conference/cruise, including Linus Torvalds and Eric Raymond, held a meeting with the Jamaican Linux Users Group. Now, we're out at sea (en-route to Holland America's private island, "Half Moon Cay"), so I'm using the satellite link to upload this column (for a mere 30 cents a minute).
|
|
Last month, I showed how to fetch a subset of the CPAN (Comprehensive Perl Archive Network) to create a local mini-mirror. The subset included just the latest distribution of each module, plus the index files, so that the CPAN.pm module could install and update your local modules.
|
|
The Comprehensive Perl Archive Network, known as "the CPAN," is the "one stop shopping center" for all things Perl. This 1.2 GB archive contains over 13,000 modules for your Perl programs, as well as scripts, documentation, many non-Unix Perl binaries, and other interesting things.
|
|
Since the first version of Unix back some three decades ago, the fork() system call has been the normal way to get many things to happen at once. Forking is a very nice (some say "elegant") model of concurrent execution: individual processes have entirely separate address spaces, with little chance of interference from other tasks, at the cost of a lot of overhead for interprocess communication.
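The canonical fork-and-wait pattern, in its smallest form (the exit value 7 is just a stand-in for real work):

```perl
use strict;
use warnings;

# fork returns the child's PID to the parent, 0 to the child,
# and undef on failure -- each process gets its own address space
defined(my $pid = fork) or die "fork failed: $!";

if ($pid == 0) {          # child
    exit 7;               # pretend to do some work, then report a status
}

waitpid($pid, 0);         # parent blocks until the child finishes
my $status = $? >> 8;     # high byte of $? holds the exit value
print "$status\n";        # 7
```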
|
|
CGI applications are often used to search through some database. For example, a catalog might let you look for an item by color, or an on-line dating service might let you pick people by gender, location, age, and interests.
|
|
The CGI protocol is wonderful for the remote execution of short tasks. But how do you execute a longer task? A task can't just run without giving some kind of feedback to the user -- eventually either the user will get bored or Apache will drop the connection.
|
|
I recently decided to put the stonehenge.com Web site under CVS (Concurrent Versions System) management. With the CVS tools, I can "check out" a current version of the Web site sources, modify it as necessary, test it on a development server, and then "check in" the changes for deployment on my live server -- the same way the big boys do it. I can also let other Stonehenge druids edit portions of the site, a task that had been exclusively my job (along with the dozens of other self-appointed roles I fill at Stonehenge).
|
|
The Linux box currently hosting stonehenge.com is in a rented space at a co-location facility. As a result of the Internet shakeout happening everywhere, the co-lo facility was bought by a larger networking company and we've been having network interruptions, including complete loss of service, from time to time. The administrator of the box came to me looking for evidence that these outages had been going on for extended periods of time so that he could take that to the new owner, get some of his money back, and pass the savings along to me.
|
|
Recently, I attended a presentation at the Portland Linux Unix Group (http://www.pdxlinux.org/) by Michael Rasmussen. At one point in his talk, Michael mentioned that he needed to analyze the traffic on his company's Web server and was surprised that many of the commercial and freely available Web traffic tools do not provide satisfactory reports on the amount of traffic, either bytes per second or hits per second, during "bursts" or "spikes" in the load. But, being reasonably fluent in Perl, Michael wrote a quick script to crawl through the text Web log, and got the data he needed.
|
|
The great thing about Web servers is that they can serve more than Web pages. They can serve stuff. Sometimes that stuff is inside tarballs, those little bundles of joy that efficiently hold many files (sometimes numbering in the thousands) for convenient transferring or archiving. A recent message on the Perl Monastery (http://www.perlmonks.org) inspired me. The person known as "Screamer" posted a little note titled "Serving tarball contents as part of your webspace." It was very short and appears in Listing One.
|
|
Sometimes solving little problems can be fun. You stare at the project requirements, then stare at the available tools and figure out how to bridge the gap from the tools to the problem solution. However, sometimes I get frustrated when I'm treading new ground, because the task needs to be done yesterday. So I'm always on the lookout for little snippets I can reuse for solving a particular style of problem. With this in mind, I'd like to share with you some snippets I hope you can reuse, since I spent a bit of time inventing them in the first place.
|
|
Who has time to make those cute little graphic buttons for their Web site -- especially when you're redesigning it and are changing the text (or, perhaps the text varies sometimes)? Well, I was faced with that issue the other day while contemplating Yet Another Redesign for my Web site at perltraining.stonehenge.com. I wanted to include some "next" and "previous" buttons, but didn't want to spend a lot of time in some bitmap-drawing program coming up with them.
|
|
The other day, I was looking at rsync to set up the publishing of my Web site from a CVS-managed archive. I thought it would be simple to use rsync in "archive" mode to accurately mirror a staging directory. But I just couldn't get the hooks right. I also wanted to ignore specific differences and add mail notification for when certain pages were updated.
|
|
The folks at Red Hat recently selected the open source PostgreSQL database as the foundation for their commercial Red Hat Database product. This decision, however, was not made without a good deal of whining from the ranks of the MySQL faithful, who weren't able to fully comprehend why it was that their baby had been passed over.
|
|
The traditional File::Find module included with Perl is nice. I use it frequently. However, it's got this interesting feature -- or rather limitation; it wants to be in control until it has found everything of interest. Now, that's perfectly fine for most applications, but I've occasionally wanted to turn a hierarchical search "inside out." That is, I'd set up the search, including the start directories and the "wanted" criteria, and then repeatedly call some function to get the "next" item of interest. This is similar to how you could call the find program externally:
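That external invocation might look like the following, reading find(1) as a stream so each readline hands back the "next" item -- the inside-out control flow the column describes. (This assumes a find that supports -maxdepth, as GNU and BSD find do.)

```perl
use strict;
use warnings;

# Open a pipe from an external find; nothing runs until we read
open my $find, '-|', 'find', '.', '-maxdepth', '1', '-type', 'd'
    or die "can't run find: $!";

my $first;
while (my $path = <$find>) {
    chomp $path;
    $first = $path;        # take just the "next" item of interest
    last;                  # we stay in control; find just feeds us
}
close $find;
print "next item: $first\n";
```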
|
|
Recently on the Perl Monastery (http://www.perlmonks.org), the user known as ton asked about parsing a Perl-style double-quoted string, as part of a project to construct a safe Data::Dumper parser that would take output and interpret it rather than handing the result directly to eval. The work in progress for their Undumper was posted, and I commented that there was probably a simpler way to do some of the things and noted that it didn't handle blessed references.
|
|
The Apache Web server that handles the www.stonehenge.com domain logs its transactions directly to a MySQL database using a mod_perl handler. This is really cool, because I can perform statistical correlations on hits over the past few months, including such complex things as the average CPU time used for a particular URL (to see if some of my dynamic pages need better caching) and the greatest number of referrers to a particular page.
|
|
One of the things that distinguishes Perl as a powerful and practical tool in the Linux toolbox is its ability to wrangle text in interesting ways that makes it seem effortless. A majority of that ability can be attributed to Perl's very powerful regular expressions. Regular expressions are nothing new. I was using them with Unix tools in 1977, and I suspect they go back even further than that. But Perl continues to push the envelope of how regular expressions work; so much so that the GNU project includes a "perl-compatible-regular-expressions" library (PCRE) so that other tools can catch up to Perl!
|
|
In last month's column, I described a program that rips through screenit.com's database of movie reviews and extracts the "profanity" paragraphs, which detail how nearly 1,000 recent movies have used words that some might find offensive. This month, I'll look at a quiz engine that picks a movie from the database at random, presents the profanity paragraph, and requests a multiple-choice response to test your knowledge of which movie that paragraph is describing.
|
|
I have a pretty long list of "write a magazine article about this someday" items. But I could always use more, so if you want to see your name in print, please e-mail your ideas to me, and you'll be appropriately credited! One item that's been in there for nearly as long as I have been keeping a list is "show how to design an online quiz correctly so that people can't cheat." Why this? Well, far too often, I've seen "Web quiz" freeware that was all too trivial. The right answer was either guessable via staring at the mouseover URLs, or I could simply hit the "back" button and try a different answer if I got one wrong.
|
|
I wrote a Web page the other day and realized that I wanted footnotes. I wanted to keep the main message in the main text and have annotations for some of the side points. It's easy enough to do, right? Just put some text in a table at the end, use those cute little sup tags around the footnote numbers, and hack away.
|
|
Suppose my friend Fred has a Web site that has grown too big for him to handle by himself. So he gets his buddy Barney to create some of the HTML and draw up a few of the images. How can Barney edit the files on Fred's hard drive, especially if Barney is on the wrong side of some corporate firewall? Well, Fred could create a CGI script to upload the files into the right place. However, then the script runs as the Web user and not as Fred. This would require Fred to mess with wide-open permissions (or setuid wrappers) and either https authentications or (worse) repeatedly sending the update password over the wire during Basic Authentication handshaking.
|
|
I find myself spending a lot of time participating in online discussion areas. Originally, all we had was Usenet. However, the concept of a "Web-based community" has finally taken hold. These communities usually provide some sort of message-based system (often with threading and separate discussion areas for topics) and frequently an HTML or Java-based "interactive chat" area.
|
|
Most Perl scripts aren't doing anything glamorous. They're the workhorse of your system, moving things around and handling those mundane repetitive tasks while you aren't necessarily looking. Those tasks are often on a series of filenames, perhaps not known in advance but obtained by looking at the contents of a directory. Perl has a few primary means of getting lists of names, so let's take a look at them.
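The two primary means side by side, against a disposable example directory:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/report.txt" or die $!; close $fh;
open $fh, '>', "$dir/notes.md"      or die $!; close $fh;

# glob: shell-style wildcard expansion, full paths included
my @txt = glob "$dir/*.txt";
print scalar(@txt), "\n";                     # 1

# opendir/readdir: every entry, including . and .., names only
opendir my $dh, $dir or die $!;
my @entries = grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;
print scalar(@entries), "\n";                 # 2
```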
|
|
Dynamic content on a Web site is cool. It keeps people coming back and creates the appearance that there are people behind the scenes actively updating the site to provide new and improved information. However, in the real world, you'll regret the day you added that SSI include directive in your homepage when you finally do something important enough that Slashdot notices you. Running a CGI script on every single Web hit is a great way to get your Quake game interrupted.
|
|
Well, as a nice follow-up to last month's column, "Determining Text Comprehensibility," about how downright unreadable the good old online documentation can get sometimes, let's play around a bit with automated translators. You're probably familiar with the Babelfish Translator at Altavista (http://babelfish.altavista.com). It's a nice demonstration of some human-language machine translation. You simply type in (or paste in) a short chunk of text, select from a dozen different pairs of languages, and get an approximate translation in a few seconds.
|
|
Ahh, manpages. Some of them are great. But, a few of them are just, well, incomprehensible. So I was sitting back a few days ago, wondering if there was a way to locate the really ugly ones for some sort of award. Then I remembered that I had seen a neat module called Lingua::EN::Fathom that could compute various statistics about a chunk of text or a file, including the relative readability indices, such as the "Fog" index. The "Fog" index is interesting in particular because it was originally calibrated to be an indication of "grade level," with 1.0 being "first grade text" and 12.0 being "high school senior." At least that's the way I remember it.
|
|
The Web server for www.stonehenge.com is a nicely configured Linux box (of course) located at a nice co-location facility and maintained by my ISP. I share the box with a dozen other e-commerce clients, and that keeps me and everyone else on our toes about overloading the server, because we all have to share. I bought a digital camera some large number of months ago and started putting nearly every picture I took up on the site. I've got a nice mod_perl picture handler to show the thumbnails, provide the navigation, and even generate half-size images on the fly using PerlMagick.
|
|
Have you ever gone out into the workshop to make something interesting, only to find that the workbench you want to use is too short or long or not high enough? Or maybe it doesn't have clamps in the right places or it's just too uneven? So then you sit down and spend some time first creating a good workbench, in the hope that this will support (literally) your work in creating the thing you had started out to make.
|
|
In last month's column, I presented a framework to allow many parallel tasks to be performed efficiently, in anticipation of using that framework as-is for this month's program: a parallel web-site link checker. Well, wouldn't you know it? After writing the rest of the code, I found that I had left out some of the needed hooks. And, while I was on a boat for Perl Whirl 2000 (the first Perl conference on the high seas), I thought of more things I could add to the process manager for the forked-off processes. So, after much gnashing of teeth, I used all of my skills of random feature creep and cut-n-paste, urgh, I mean, code reuse to create the monster in Listing One (pg. 94).
|
|
More and more Web-hosting services and ISPs are providing CGI space in addition to customer Web pages, either as a free add-on, or an extra-cost service. And there are even a few free CGI servers out there on the Net. The problem with these services is that the (shared) Web error log is often inaccessible, or at an unknown location. That's fine if your CGI program never commits an error, or if you are using the PSI::ESP module to determine the error text. But most of us will write "blah blah or die blah" in our CGI scripts, expecting to somehow be told what's wrong when it goes wrong.
|
|
Several months ago in this space, I talked about how my ISP was looking at the performance of their news server. I wrote a program to see just how bad the news service was compared to the other local ISPs, using Deja as a baseline. Well, the ISP just got bought out by a big national chain. They decided not to fight the spotty news service any more and just convert over to the conglomerate's big service. The problem with moving from one news server to another is that the article numbers are not in sync, so a .newsrc file will have the right newsgroups but the wrong "read" marks. And since I read a lot of newsgroups, I don't have time to reread existing articles, and I don't want to just throw away any new articles.
|
|
Last month's column was a brief tutorial introducing the concept of objects in Perl. We covered some of the basic concepts of object-oriented programming, including class methods and inheritance. We learned how to factor out and reuse common code with variations. This month we'll learn how to create instance data, which is information associated with one particular object.
|
|
In the past three columns, I looked at using "references" in Perl. References are an important part of capturing and reflecting the structure of real-world data -- for example, a table of employees, each of whom has various attributes, can be represented as an array of hashrefs, pointing at attribute hashes for each employee.
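That employee table as an array of hashrefs might look like this (names and salaries invented for the example):

```perl
use strict;
use warnings;

# Each row of the table is a hashref of attributes
my @employees = (
    { name => 'Fred',   dept => 'quarry',  salary => 50_000 },
    { name => 'Barney', dept => 'quarry',  salary => 48_000 },
    { name => 'Wilma',  dept => 'finance', salary => 55_000 },
);

# The arrow syntax reaches through the references
print $employees[2]{name}, "\n";              # Wilma

# Filtering rows reads almost like a database query
my @quarry = grep { $_->{dept} eq 'quarry' } @employees;
print scalar(@quarry), "\n";                  # 2
```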
|
|
In the past two columns, I looked at using references in Perl and showed the basic syntax for creating references to arrays, hashes, scalars, and subroutines. I also described the canonical form of converting a non-reference expression into a reference, and how to use the shortcut rules to make this simpler.
|
|
You need references. Everybody programming in Perl does, since they are one of the basics of the language. A bit like C's pointers, references can be used to refer to all sorts of other things, including scalars, arrays, hashes, filehandles, typeglobs, subroutines, and synthetic data structures. If C calculates addresses and dereferences pointers with & and *, respectively, Perl does much the same with \ and $.
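The parallel with C in a few lines:

```perl
use strict;
use warnings;

my @list = (10, 20, 30);
my %hash = (a => 1);

# \ takes a reference, much like C's address-of operator &
my $aref = \@list;
my $href = \%hash;

# An extra sigil (or the arrow) dereferences, like C's *
print ${$aref}[0], "\n";     # 10 -- canonical form
print $aref->[1], "\n";      # 20 -- arrow shortcut
print $href->{a}, "\n";      # 1
print scalar @$aref, "\n";   # 3 -- whole-array dereference
```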
|
|
My Web server for http://perltraining.stonehenge.com is on a nicely configured shared Linux box at a 24x7-manned co-location facility. While I'm not really system administrator for this box, I still want to be sure that my Web things aren't bogging the system down unnecessarily. (If that happens, the other e-commerce users will start rallying to kick me off.) This is especially true as I experiment more with dynamically generated pages and toys for columns like this one.
|
|
Usenet news has been around since 1979. I've been reading news nearly daily since 1980, except for a brief hiatus in 1984 when I missed the "great renaming" that gave us our current Usenet naming scheme. Because news is important (and familiar) to me, it's important for me to read news from a news server that has fairly decent article coverage.
|
|
Perl has many ways of launching and managing different programs. This is a Good Thing, because Perl's ability to launch and manage programs -- or child processes -- is one of the reasons it makes such a great
"duct-tape of the Internet." The easiest way to launch a child process is with system:
|
|
I'm looking for a Perl script that uses recursive grammar techniques to generate random sentences. I've found several scripts that will throw up a string of text chosen from a pre-made list, but I'd really like to find something that generates sentences on the fly.
|
|
According to the folks who
survey such things, the Open Source Apache server is the most popular
Web server on the Internet. And Perl is the language of choice for
many scripts running on all those Apache servers. But
if you really want to get the most out of Perl and Apache, you need to
embed Perl directly into your server using Apache's
mod_perl extension.
|
|
Perl has a lot of cool stuff. Certainly, the basic: print "Hello, world!\n"; gets people started without knowing much about the language, but the question "Is there a way to do (X) in Perl?" can usually be answered "Yes!"
|
|
The Perl community is one of the most well-established demonstrations of the
Open Source Software movement. Many people that have benefited from Perl's
openness have in turn contributed libraries and scripts back to the public for
others to use. The collective contributions to the Perl community have been
organized into the Comprehensive Perl Archive Network, known more commonly as the
CPAN.
|
|