File size discrepancy


#1

I run Debian, as does DH. When I upload my files, image files are the
same size on both machines, but html files on DH are always smaller
than those on mine, e.g. 9509 bytes on DH vs 9717 bytes locally.
Is there any explanation for this?
Regards.


#2

I just checked, and my file sizes are all the same. The files on my Debian box at home, on my VPS and here at DH have the same size. Does this happen to files you’ve just uploaded? What if you download them to your local system? Do they still have the DH size?

Jan



#3

Downloaded above file from DH but now my local file is down to 9509
bytes to match DH.

$diff dh-file.htm local-file.htm [outputs 418 lines]
$nl dh-file.htm [outputs 190 lines]
$nl local-file.htm [outputs 208 lines]

Very odd since both pages display correctly and are similar to each
other in the browser.

Regards.


#4

I’m sure you are aware of this problem with transferring text files, but it begs the question: what software are you using locally that writes CR+LF instead of just LF?



#5

Thanks for the response.

I am aware of the CR/LF diff but I would never suspect Stallman of
coding emacs to output CR+LF! Don’t think it is possible to ‘flip’ an html file.

What’s interesting is that when I downloaded the dh-file.htm in the
morning it appeared to be different from local-file.htm and gave the
above differences but when I checked files several hours later (after
shutdown) both files appeared to be identical with ‘diff’ remaining
silent.

Then I downloaded dh-file.htm again and the problem repeated as above,
with wc outputting the following:

$wc dh-file.htm [208 875 9509]
$wc local-file.htm [208 875 9717]

I’m not sure if this is the right forum to raise this question but any
clues would be much appreciated.

Regards.


#6

[quote]I am aware of the CR/LF diff but I would never suspect Stallman of
coding emacs to output CR LF! Don’t think it is possible to ‘flip’ an html file.

$wc dh-file.htm [208 875 9509]
$wc local-file.htm [208 875 9717][/quote]

[quote]I’m not sure if this is the right forum to raise this question but any
clues would be much appreciated.[/quote]
You asked for an explanation of the difference in file sizes for HTML files. I’ve pointed out that your copy is using CR LF and the DreamHost copy is using LF. The output above proves that: the line count (208) is equal to the difference in byte count (9717 - 9509 = 208), i.e. one extra CR per line.
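
To make that arithmetic concrete, here is a minimal Python sketch. It is only an illustration: the function name `line_ending_stats` and the three-line sample are made up, not taken from the files discussed in this thread.

```python
def line_ending_stats(data: bytes) -> dict:
    """Count CR LF pairs and bare LF terminators in a byte string."""
    crlf = data.count(b"\r\n")
    lf_total = data.count(b"\n")  # includes the LF of every CR LF pair
    return {"bytes": len(data), "crlf": crlf, "bare_lf": lf_total - crlf}


# Hypothetical sample: the same three lines under each convention.
unix_copy = b"<html>\n<body>\n</html>\n"
dos_copy = unix_copy.replace(b"\n", b"\r\n")

unix_stats = line_ending_stats(unix_copy)
dos_stats = line_ending_stats(dos_copy)

# Each CR LF line costs exactly one extra byte, so the size gap equals
# the number of lines (in the thread: 9717 - 9509 = 208 lines).
size_gap = dos_stats["bytes"] - unix_stats["bytes"]
print(size_gap, dos_stats["crlf"])
```

If the size gap between two copies matches the line count reported by wc, a CR LF versus LF difference is the most likely explanation.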

The .htm extension leads me to believe your files came from Windows computers somehow. You do realize that emacs has a DOS mode, where it will read files with CR LF and when saving them will write the CR LF back out, don’t you?

Regardless, on either machine you can use dos2unix and unix2dos to translate the line terminators as necessary. You also need to check your file transfer client, because it is obviously translating the line terminators during transfer (i.e. “ASCII” mode versus binary mode, which is why your image files don’t get corrupted).
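
For anyone without those tools installed, the translation itself is simple. The following Python sketch shows the idea only; it is not how dos2unix or unix2dos are implemented, and the names `to_unix` and `to_dos` are hypothetical.

```python
def to_unix(data: bytes) -> bytes:
    """Replace every CR LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def to_dos(data: bytes) -> bytes:
    """Normalise to LF first so existing CR LF pairs are not doubled."""
    return to_unix(data).replace(b"\n", b"\r\n")


# Hypothetical two-line sample with DOS line terminators.
sample = b"line one\r\nline two\r\n"
converted = to_unix(sample)

# The converted copy is shorter by one byte per line.
print(len(sample), len(converted))
```

When applying this to real files, open them in binary mode ("rb" and "wb") so that Python does not translate the line terminators itself.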

And if you still need proof that it’s due to CR LF, then tar or zip your copy, upload it, untar or unzip it, and browse to it using http://web-sniffer.net/ with “Raw HTML view” checked; it will indicate non-printing characters such as CR and LF when present.
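
The same kind of check can be done locally. This is a rough Python sketch that makes the terminators visible; the function name `show_terminators` and the `<CR>`/`<LF>` markers are made up for illustration and are not the output format of the site above.

```python
def show_terminators(data: bytes) -> str:
    """Return the text with CR and LF replaced by visible markers."""
    # latin-1 maps every possible byte to a character, so decoding never fails.
    text = data.decode("latin-1")
    return text.replace("\r", "<CR>").replace("\n", "<LF>\n")


# Hypothetical sample: first line ends in CR LF, second in bare LF.
print(show_terminators(b"<p>hi</p>\r\n<p>bye</p>\n"))
```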



#7

Atropos7, you have solved the problem! Thanks for persevering.
The following is background for anyone who has been following this.

  1. The .htm file extension is not significant since I use that to
    save ONE byte (some of us are tightwads).

  2. I have been using emacs to format my html files since well before
    release of etch so I could not understand why my files would be saved
    with CR+LF. But I was not aware that emacs would preserve this
    convention on newly created files.

  3. To save work I use a template originally created with Dreamweaver 3
    on Win98 and simply do ^X^W to get the new file. Now I understand that
    emacs simply preserves the original CR+LF in my template.

  4. When I open saved html files from other sources in emacs to get
    rid of the scripts, the CR+LF show up as ^M. It was the absence of
    these in my own files that made me think I was in unix mode and not
    dos.
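
Points 3 and 4 fit together if an editor picks one convention per file: a file whose lines all end in CR LF is handled as DOS and shows no ^M, while a file with inconsistent terminators is handled as unix and shows the stray CRs. Below is a rough Python sketch of that kind of classification. It is a simplification made for illustration, not emacs's actual detection code, and `eol_convention` is a made-up name.

```python
def eol_convention(data: bytes) -> str:
    """Classify line endings as 'dos', 'unix', 'mixed' or 'none'."""
    total = data.count(b"\n")    # every line terminator ends in LF
    crlf = data.count(b"\r\n")   # terminators that are CR LF pairs
    if total == 0:
        return "none"
    if crlf == total:
        return "dos"
    if crlf == 0:
        return "unix"
    return "mixed"


# Hypothetical samples, one for each case.
print(eol_convention(b"a\r\nb\r\n"))
print(eol_convention(b"a\nb\n"))
print(eol_convention(b"a\r\nb\n"))
```

Running a check like this on a template before reusing it would reveal an inherited CR LF convention that the editor otherwise hides.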

Thanks again for your help.