Perl Form Upload Script Source

software development

#1

OK in another thread, a poster asked why the script he was using was not working. However that script is a really poor example of how to upload a file using a web page, and I said I’d re-write my previous example for him post the result here. This script is provided by myself as an example.

In addition this script does not address security issues and should not be used as-is.

Installation instructions:

  1. create the directory where you want uploaded files to go and write
    down the full filesystem path. permissions on directory should be 755.
  2. put script in domain with CGI enabled
  3. chmod script file to 755
  4. edit script variables
    a. on DreamHost, must use ‘/home/username/path’ for filesystem path
  5. reference it in a form with <form method="POST" action="script url" enctype="multipart/form-data">----------[code]#!/usr/local/bin/perl

released into public domain.

this script handles only 1 file upload field

use CGI;
use CGI::Carp qw(fatalsToBrowser);
use File::Copy;

1 = error messages using CGI::Carp

my $DEBUG = undef;

url - must include trailing slash

my $uploads_url = ‘http://yourdomain/uploads/’;

filesystem path - do not include trailing slash

my $upload_path = ‘/home/username/yourdomain/uploads’;

my $upload_error = ‘http://yourdomain/upload.html’;
my $upload_success = ‘http://yourdomain/upload.html’;
my $file_field = ‘file’;

my $CGI = new CGI;
my ($filename, $file, $filehandle, $filesize);

if ($filehandle = $CGI->upload($file_field)) {

get filename from form

first regexp: remove leading windows path ‘d:\path\filename’ = ‘filename’

second regexp: remove leading linux path ‘/path/filename’ = ‘filename’

third regexp: replace whitespace with underscore

$filename = $CGI->param($file_field);
$filename =~ s/^.\//;
$filename =~ s/^.
///;
$filename =~ s/\s /_/g;
$file = $upload_path . ‘/’ . $filename;

if (length $filename > 0) {

copy the temporary file to destination

binmode($filehandle);
seek($filehandle, 0, 0);
if (copy($filehandle, $file) > 0) {
close($filehandle);
$filesize = -s $file;
1 while $filesize =~ s/(\d)(\d{3})(?!\d)/$1,$2/;
print
$CGI->header,
&success_page($filename, $filesize);
}
else {

could not copy the temporary file to $file

die(“Could not copy filehandle to $file: $!”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}
}
else {

filename was blank or invalid format

die(“Filename was blank or invalid format: ‘$filename’”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}
}
else {

CGI::upload() returned undef

which means no file was uploaded or parsing error

die(“No file was uploaded or parsing error”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}

exit;

sub success_page {
my $url = $uploads_url . $filename;
return <<HTML;

Form Upload

Thank you!


Your file has been successfully transferred to this site.

Filename: $filename< br > Size: $filesize bytes< br > URL: $url< br >

Click here to continue.

_HTML_ }[/code]------------

:cool: Perl / MySQL / HTML CSS


#2

That’s great of you to share your code but please do some basic security checking/audit on your code in future. I can see a few potential pitfalls and I really don’t fancy promoting potentially damaging code to fellow hosters here.

  • wil

#3

I’ve edited the original post to reflect why I posted it. It was meant to simply replace the code in this post in another thread which was a really bad example of how to handle a file upload.

I’d appreciate hearing more about security as well. One would definitely want to have a limit on how much data the script would accept, as well as actually better checking of the filename and whether or not it should be able to overwrite files.

:cool: Perl / MySQL / HTML+CSS


#4

Hi

Sorry I haven’t revisited this since posting. Thanks for updating the description, it just makes me sleep a little better at night. :slight_smile:

I haven’t looked through your code thoroughly, but I’ll say what jumps into my head from giving your code the once-over:

  1. use strict;
    In this scneario there’s no excuse for not using strict. You should use it. It will save you on headaches later on, especially if you’re extending the code in some way.

  2. Restrict what files the user can upload.
    Using built-in CGI.pm methods, you can find out what the file actually is. This is a good trick to remember:

[code] my $filename = $q->param(‘file_upload’);

my $type = $q->uploadInfo($filename)->{‘Content-Type’};
[/code]This will tell you what content the file is holding. Not a foool-proof way, of course, but a good indicator. Probably more reliable than checking for file extensions such as .jpg or .gif.

  1. You don’t need binmode on a linux box. On (li)nux a file is a file regardless of if it holds binary data inside or not. So this line isn’t needed and again, you can take advantage of CGI.pm built in functions here to get your filehandle:

[code] my $fh = $q->upload(‘file_upload’);

my $filename = shift;

open (FH,">/home/safe/place/$filename") or die $!;
while (<$fh>) {
print FH $_;
}
close FH;
[/code]Hope this helps. Sorry I haven’t got more time to go through this right now.

  • wil

#5

Due entirely to your gracious help, I have a mostly working version of your script running at http://ykfp.org/cgi-bin/upload3.cgi

I changed it to generate the form with the script using function-based calls to CGI.pm because that’s what I’ve been trying to learn. But I still have two issues I can’t figure out:

  1. Where is that Status: 302 or Content-type line below the submit button coming from?
  2. A mis-typed file name still goes to the upload success section, since filename > 0. I wonder how I might trap this error.

#!/usr/local/bin/perl

released into public domain.

this script handles only 1 file upload field

use strict;
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser);
use File::Copy;

1 = error messages using CGI::Carp

my $DEBUG = undef;

url - must include trailing slash

my $uploads_url = ‘http://ykfp.org/uploads/’;

filesystem path - do not include trailing slash

my $upload_path = ‘/home/ykfpdata/ykfp.org/uploads’;

my $upload_error = ‘http://ykfp.org/cgi-bin/upload3.cgi’;
my $upload_success = ‘http://ykfp.org/cgi-bin/upload3.cgi’;
my $file_field = ‘file’;

my $CGI = new CGI;
my ($filename, $file, $filehandle, $filesize);

print header (),
start_html (-title => “File Upload”, -bgcolor => “white”);

print start_multipart_form (-action => url ()),
"Your file: ", br (),
filefield (-name => “file”, -size => 90),
br (),
submit (-name => “choice”, -value => “Submit”),
end_form ();

if ($filehandle = $CGI->upload($file_field)) {

get filename from form

first regexp: remove leading windows path ‘d:\path\filename’ = ‘filename’

second regexp: remove leading linux path ‘/path/filename’ = ‘filename’

third regexp: replace whitespace with underscore

$filename = $CGI->param($file_field);
$filename =~ s/^.\//;
$filename =~ s/^.
///;
$filename =~ s/\s /_/g;
$file = $upload_path . ‘/’ . $filename;

if (length $filename > 0) {

copy the temporary file to destination

binmode($filehandle);
seek($filehandle, 0, 0);
if (copy($filehandle, $file) > 0) {
close($filehandle);
$filesize = -s $file;
1 while $filesize =~ s/(\d)(\d{3})(?!\d)/$1,$2/;
print
$CGI->header,
&success_page($filename, $filesize);
}
else {

could not copy the temporary file to $file

die(“Could not copy filehandle to $file: $!”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}
}
else {

filename was blank or invalid format

die(“Filename was blank or invalid format: ‘$filename’”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}
}
else {

CGI::upload() returned undef

which means no file was uploaded or parsing error

die(“No file was uploaded or parsing error”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}

exit;

sub success_page {
my $url = $uploads_url . $filename;
return <<HTML;

Form Upload

Thank you!


Your file has been successfully transferred to this site.

Filename: $filename Size: $filesize bytes

Click here to continue.

_HTML_ }

#6

For image files one can use the Image::Size module. In addition to the dimensions it also sniffs GIF, JPG, XBM, XPM, PPM family (PPM/PGM/PBM), XV thumbnails, PNG, MNG, TIF, BMP, PSD, SWF, SWC, PCD. Hmm, well at least with the version on my machine.

And I imagine one should also be using taint mode, ala Perl 5.6 Documentation: perlsec

Also according to the CGI.pm documentation one might want to use something like $CGI::POST_MAX=1024 * 100; # max 100K posts $CGI::DISABLE_UPLOADS = 1; # no uploadsForgot about binmode() - I develop my scripts locally on WinXP and usually only have to worry about making sure not to call flock() when running under Win32.

Tried to use File::Copy as an example of not having to re-invent the wheel, kinda like using CGI.pm, though of course it is overkill.

:cool: Perl / MySQL / HTML+CSS


#7

The script as originally written is meant to perform a single function of recieving the file that is being uploaded. But now you want the script to handle a mutually exclusive l function of showing the form page. So you need to add some logic that says “If I am not recieving a file, I should show the form page.” As you have it the logic is “Show the form and recieve a file.” And since no file is being submitted, the script is trying to indicate there was an error. Since you sent the form web page out first, the redirect to an error page is being shown as part of the web page.

In other words you need an if statement like: [code]if ($ENV{‘REQUEST_METHOD’} eq ‘POST’}) {

uploading a file?

} else {

definitely not uploading a file, show the form

}[/code]

You can tell I don’t use this type of script often. It turns out that if you mis-type a filename, CGI module will still open a valid filehande even though the documented behaviour is to return an undefined value. However the file will be zero-length. Logic should be:

[code]$filename = $CGI->param($file_$field);
$filehandle = undef;
$filesize = 0;
if (length $filename > 0) {

file field was not blank

get filesize by going to end of the file

and then asking how far it is from beginning of file

$filehandle = $CGI->param($file_$field);
seek($filehandle, 0, 2);
$filesize = tell($filehandle);
if ($filesize == 0) {

mis-typed file name -OR- the file was zero-length to begin with

}
else {

a file was definitely uploaded

copy($filehandle, $file);
}
}
else {

file field was blank

}[/code]Mozilla 1.6a and IE 6.0 only include “Content-Type” and “Content-Disposition” headers for file uploads. Not only that, but Content-Disposition will still indicate that its a file field and include the value as a ‘filename’. So $CGI->uploadInfo won’t help determine if a file was uploaded or not.

:cool: Perl / MySQL / HTML+CSS


#8

Thanks for all your help. I’ve made some progress on the script and learned a lot. I compartmentalized the form display and the form processing into subroutines, and made use CGI calls as functions. I added an if test to see if the filename is mistyped by checking the file length as you suggested but I don’t have that test doing anything helpful right now. I can see it would be nice to also use this file length check to limit the size of files I want to allow for uploads. Also the onerror and onsuccess urls just go back to the same script for now.

I discovered that I can’t handle versioning on DH like I can on my own Solaris box and my own install of Apache I use for testing. I usually name my files with a serial number like upload4.cgi and upload5.cgi and make a symbolic link called upload.cgi linked to my best working version. That way I don’t always have to make sure that all the self calls inside a version have been up dated. I’ll just leave them set to upload.cgi. But it didn’t work on DH. I think they don’t use that FollowsymLinks directive in their Apache config.

But I still have one puzzle I can’t figure out: my success_page routine adds a stray line “Content-Type: text/html; charset=ISO-8859-1” at the top of the html page. I see that that subroutine prints out html tags directly, copied from your example, rather than the CGI calls I used in the form display. But I don’t se why I’m getting that stay line. See: http://ykfp.org/cgi-bin/upload5.cgi

#!/usr/local/bin/perl

released into public domain.

this script handles only 1 file upload field

use strict;
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser);
use File::Copy;

1 = error messages using CGI::Carp

my $DEBUG = undef;

url - must include trailing slash

my $uploads_url = ‘http://ykfp.org/uploads/’;

filesystem path - do not include trailing slash

my $upload_path = ‘/home/ykfpdata/ykfp.org/uploads’;

my $upload_error = ‘http://ykfp.org/cgi-bin/upload5.cgi’;
my $upload_success = ‘http://ykfp.org/cgi-bin/upload5.cgi’;
my $file_field = ‘file’;

my $CGI = new CGI;
my ($filename, $file, $filehandle, $filesize);

print header (),
start_html (-title => “Submit your file to YKFP”, -background => “…/watback.jpg”);

Dispatch to proper action based on user selection

#@ DISPATCH
my $choice = lc (param (“choice”)); # get choice, lowercased

if ($choice eq “”) # initial script invocation
{
display_upload_form ();
}
elsif ($choice eq “submit”)
{
process_form ();
}
else
{
print p (escapeHTML (“Logic error, unknown choice: $choice”));
}
#@ DISPATCH

print end_html ();

exit (0);

#Display the uploading form
#@ DISPLAY_UPLOAD_FORM
sub display_upload_form
{
print start_multipart_form (-action => url ()),
"Submit your file to YKFP: ", br (),
filefield (-name => “file”, -size => 90),
br (),
submit (-name => “choice”, -value => “Submit”),
end_form ();
}
#@ DISPLAY_UPLOAD_FORM

#@ PROCESS_FORM
sub process_form
{
if ($filehandle = $CGI->upload($file_field)) {

get filename from form

first regexp: remove leading windows path ‘d:\path\filename’ = ‘filename’

second regexp: remove leading linux path ‘/path/filename’ = ‘filename’

third regexp: replace whitespace with underscore

$filename = $CGI->param($file_field);
$filename =~ s/^.\//;
$filename =~ s/^.
///;
$filename =~ s/\s /_/g;
$file = $upload_path . ‘/’ . $filename;

$filename = $CGI->param($file_field);
$filehandle = undef;
$filesize = 0;

if (length $filename > 0) {

file field was not blank

get filesize by going to end of the file

and then asking how far it is from beginning of file

$filehandle = $CGI->param($file_field);
seek($filehandle, 0, 2);
$filesize = tell($filehandle);
if ($filesize == 0) {

mis-typed file name -OR- the file was zero-length to begin with

}
else {

a file was definitely uploaded

copy the temporary file to destination

binmode($filehandle);
seek($filehandle, 0, 0);
if (copy($filehandle, $file) > 0) {
close($filehandle);
$filesize = -s $file;
1 while $filesize =~ s/(\d)(\d{3})(?!\d)/$1,$2/;
print
$CGI->header,
&success_page($filename, $filesize);
}
else {

could not copy the temporary file to $file

die(“Could not copy filehandle to $file: $!”) if $DEBUG
== 1;
print $CGI->redirect($upload_error);
}
}
}
else {

filename was blank or invalid format

die(“Filename was blank or invalid format: ‘$filename’”) if $DEBUG == 1; print $CGI->redirect($upload_error);
}
}
else {

CGI::upload() returned undef

which means no file was uploaded or parsing error

die(“No file was uploaded or parsing error”) if $DEBUG == 1;
print $CGI->redirect($upload_error);
}
}
#@ PROCESS_FORM

exit;

sub success_page {
my $url = $uploads_url . $filename;
return <<HTML;

Form Upload

Thank you!


Your file has been successfully transferred to this site.

Filename: $filename Size: $filesize bytes

Click here to continue.

_HTML_ }

#9

[quote]
But I still have one puzzle I can’t figure out: my success_page routine adds a stray line “Content-Type: text/html; charset=ISO-8859-1” at the top of the html page.[/quote]
It’s not the success_page routine, but where you are calling it from: print $CGI->header, &success_page($filename, $filesize);No need to call $CGI->header() again as you already called it at the beginning of the script, where you did this:print header (), start_html (-title => "Submit your file to YKFP", -background => "../watback.jpg");See, I always use the object method way of using CGI module but you started using the imported functions method. You might want to edit the code to use one method or the other but not both to avoid more headaches.

:cool: Perl / MySQL / HTML+CSS


#10

How about a check to see if the filename trying to be uploaded already exists on the server? Could prove useful.

  • wil

#11

Well, here is a near complete re-write that includes

  1. configurable overwrite and zero-length checking
  2. trapping POSTs that are too large
  3. partial support for taint mode: set PATH and untaint filename
  4. changed the way errors are handled, looks better but not useful for debugging
  5. show upload form again on errors
  6. configurable name and value for the upload form’s submit button

[code]#!/usr/local/bin/perl

released into public domain.

this script handles only 1 file upload field

use strict;
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser);
use File::Copy;

Configuration

$CGI::POST_MAX = 1024 * 1024; # limit POSTs to size in bytes

set these to undef to disable, defined to enable

my $OVERWRITE = undef; # allow overwriting existing files
my $ZEROLENGTH = undef; # allow zero-length files

url - must include trailing slash

my $uploads_url = ‘http://ykfp.org/uploads/’;

filesystem path - do not include trailing slash

my $upload_path = ‘/home/ykfpdata/ykfp.org/uploads’;

my $upload_success = ‘http://ykfp.org/cgi-bin/upload5.cgi’;

my $file_field = ‘file’; # file upload input name
my $submit_field = ‘choice’; # submit input name
my $submit_value = ‘Upload’; # submit input value

Initialization

$ENV{PATH} = “/bin:/usr/bin”; # set path to known value

output page header

print
header(),
start_html (-title => “Submit your file to YKFP”, -background => “…/watback.jpg”);

check if the form is being submitted as POST,

otherwise show form

if (request_method() eq ‘POST’) {

check to see if file upload was too large,

otherwise check to see if correct form was submitted

if (cgi_error() =~ /too large/) {
my $pretty_limit = $CGI::POST_MAX;
1 while $pretty_limit =~ s/(\d)(\d{3})(?!\d)/$1,$2/;
my $pretty_length = $ENV{‘CONTENT_LENGTH’};
1 while $pretty_length =~ s/(\d)(\d{3})(?!\d)/$1,$2/;
print
h2(“Error”),
p("Your upload was $pretty_length bytes, which is over the limit of $pretty_limit bytes. "),
hr();
}
else {

check if form was submitted

if (param($submit_field) eq $submit_value) {
&process_form;
}
else {
print
h2(“Error”),
p(“Incorrect form was submitted.”),
hr();
&display_form;
}
}
}
else {
&display_form;
}

exit;

Subroutines

sub display_form {
print
start_multipart_form(-action => url ()),
"Submit your file to YKFP: ",
br (),
filefield (-name => $file_field, -size => 90),
br (),
submit (-name => $submit_field, -value => $submit_value),
end_form ();
return;
}

sub process_form {
my $filename = param($file_field); # get filename that was provided
$filename =~ s/^.\//; # remove windows path
$filename =~ s/^.
///; # remove linux path
$filename =~ s/\s+/_/g; # replace whitespace with underscore

if no filename was provided, show error message

if (length $filename == 0) {
print
h2(“Error”),
p(“No filename was provided.”),
hr();
&display_form;
return;
}

if filename could not be untainted, show error message

if ($filename =~ /^([-@\w.]+)$/) {
$filename = $1;
}
else {
print
h2(“Error”),
p(“Filename contains invalid characters.”),
hr();
&display_form;
return;
}

my $filehandle = upload($file_field); # filehandle to temporary file

get the filesize by seeking to end of temporary file

seek($filehandle, 0, 2);
my $filesize = tell($filehandle);

if zero-length files not allowed, show error message

note: this is only way to trap misspelled filenames

if ($filesize == 0 and not defined $ZEROLENGTH) {
print
h2(“Error”),
p(“The file you uploaded was empty. Please check the spelling and try again.”),
hr();
&display_form;
return;
}

my $file = “$upload_path/$filename”; # generate full filename

if (-e $file and not defined $OVERWRITE) {
print
h2(“Error”),
p(“The file already exists and may not be overwritten.”),
hr();
&display_form;
return;
}

seek($filehandle, 0, 0);
if (copy($filehandle, $file) == 0) {
print
h2(“Error”),
p(“An error occured when writing the file to disk.”);
return;
}

my $url = $uploads_url . $filename;
my $pretty_filesize = $filesize;
1 while $pretty_filesize =~ s/(\d)(\d{3})(?!\d)/$1,$2/;
print
h1(“Thank you!”),
p(“Your file has been successfully transferred to this site.”),
blockquote(
strong("Filename: "), $filename, br(),
strong("Filesize: “), $pretty_filesize, " bytes”, br(),
strong("URI: "), a({-href=>$url}, $url)),
hr(),
p(a({-href=>$upload_success}, “Click here to continue.”));

return;
}[/code]

:cool: Perl / MySQL / HTML+CSS


#12

Yes, it hasn’t happened yet, but I was thinking “what if two different contractors submit their proposals both named ‘ourproposal.doc’?” I tested that and found that the second upload would overwrite the first with no problem, no warnings. I guess I would have had that problem too on my old host when using anonymous ftp for proposal submissions, but didn’t encounter it.


#13

Well, thanks. I’ll be able to give this script a try later today. I need to deal with paperwork for my accounting office first.


#14

A simple if (-e $file) would check to see if the file (-e)xists on the filesystem. You can then either ask to overwrite or to rename the file dynamically on the server? Just a thought.

Good work.

  • wil

#15

I put the script at http://ykfp.org/cgi-bin/upload6.cgi, and it tests out nice. Files are dumped into a ykfp.org/uploads directory that I password protected with .htaccess.

I guess the style of 1024 * 1024 used for POST_MAX is to make it easier to specify the file size in kbytes or mbytes if I added one more * 1024.

I didn’t know about POST_MAX until I read up on it at perldoc.com. It looks like a good idea in lots of situations to avoid denial of service attacks.

This stuff is pretty new to me, and it’s exciting that all this stuff I thought was just web magic is easily handled with calls to CGI.pm. I’ve been working with “MySql and Perl for the Web” by Dubois, and “Programming Perl” by Wall, Christiansen, and Orwant. Is there so good starting manuals focused on CGI?

Is the string to define $OVERWRITE and $ZEROLENGTH just def then? or defined.


#16

That and I couldn’t remember the base 10 representation of 2^20. If it helps, you might want to use constants instead:

sub KILOBYTE { 1024; }
sub MEGABYTE { 1024 * KILOBYTE; }
sub NO_LIMIT { -1; }

$CGI::POST_MAX = 5 * MEGABYTE; # 5 megabytes

$CGI::POST_MAX = NO_LIMIT; # no limit

No, sorry it just that programming conventions were switched from earlier versions. When setting a variable to undef, you are saying it has no data value, and calling defined() on the variable will return boolean FALSE. Any other assignment would make the variable defined.
However testing for “definedness” is usually done for indicating an error, lack of data, or no initial value.

Usually when perl programmers use a variable as switch they use 0 to mean ‘off’ and 1 to mean ‘on’. And instead of testing for “definedness”, you must test if the value is 0 or 1.

my $OVERWRITE = 0; # 0 = disable overwriting existing files, non-zero to enable
my $ZEROLENGTH = 0; # 0 = disable saving empty files, non-zero to enable

And then change the if statements to:

if ($filesize == 0 and $ZEROLENGTH == 0)
if (-e $file and $OVERWRITE == 0) {

Usually perl considers the value 0 to be boolean FALSE, however this is not always the case. A few system functions may return ‘0, but true’ which perl will evaluate as value 0 but boolean TRUE. The same applies for ‘0E0’, if assigned as a string value and not a numerical value. But most of the time you can do something like

if ($filesize == 0 and not $ZEROLENGTH)
if (-e $file and not $OVERWRITE)

:cool: Perl / MySQL / HTML CSS