System 6 Browser Development Opinions

Development, 322 posts, Aug 5, 2008 to Sep 2, 2010

#31 Thu, 7 Aug 2008 - 10:32

I'm assuming that paws is referring to a LISP-like notation, which is fairly straightforward to parse since almost anything can be part of a token (except parentheses IIRC).

Yeah, S-Expressions. Dead simple to parse... and read, once you get the hang of it.
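To show how little code that takes, here is a minimal sketch in Python of a reader for that kind of notation. The field names in the sample (`post`, `id`, `author`, `body`) are made up for illustration, and the sketch assumes atoms never contain spaces or parentheses (no quoted strings).

```python
def tokenize(text):
    """Split S-expression text into "(", ")" and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front of the list and return one expression."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    if token == ")":
        raise ValueError("unexpected )")
    return token  # any other run of characters is an atom


# Hypothetical forum record; the field names are assumptions.
post = parse(tokenize("(post (id 31) (author paws) (body hello))"))
```

Lists come back as nested Python lists and atoms as strings, so `post[2]` is `["author", "paws"]`.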

On the other hand, a structure like the UNIX mbox format may be more appropriate. Particularly if you have the server side do a bit of the work and particularly if you are expecting someone else to write the client side software. (mbox is a series of fields, with space for a plain text message. Conceptually, that will be easier for most programmers to understand.)
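A rough Python sketch of the "fields, then a plain text message" idea follows. It is not a real mbox reader (it ignores the "From " separator lines and folded headers), and the field names in the sample are assumptions.

```python
def parse_message(raw):
    """Split one message into a dict of header fields and the body text."""
    head, _, body = raw.partition("\n\n")  # first blank line ends the fields
    fields = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    return fields, body


# Hypothetical message as the server might send it.
raw = "Author: paws\nSubject: S-expressions\n\nDead simple to parse.\nAnd read."
fields, body = parse_message(raw)
```

The client only has to find the first blank line and split each field at the first colon, which is about as simple as a format gets.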

It could use XML, it's not really important - except for memory use, where XML would lose like a big pile of shit in a smell nice contest.

I've no clue about the internals of phpBB, but I think it might even be possible to adapt it? Like, instead of returning HTML, it'd return the data in whatever markup when the client identified itself as 68kMLAsys6client, or something. Keep the database, keep everything - but avoid doing screen scraping.

I don't know PHP and I took a vow a good while back I'd never learn it, but it seems doable.

#32 Thu, 7 Aug 2008 - 13:43

:b&w:

Isn't the problem with this not the User Interface on the Mac (Text or real graphics) but the lack of a TCP/IP Stack in System 6?

#33 Thu, 7 Aug 2008 - 15:16

Isn't the problem with this not the User Interface on the Mac (Text or real graphics) but the lack of a TCP/IP Stack in System 6?

No. MacTCP exists for system 6. There are many internet applications that run on System 6.

#34 Thu, 7 Aug 2008 - 18:10

Try looking at it from Lynx... it's fairly horrible, but can be used if you're determined. I think that's a better approximation of a minimal System 6 browser.

A better approximation, sure. But something as simple as modifying font size and weight can make an almost unreadable page in lynx much more readable. And that is within the capabilities of a System 6 Mac.

I've popped into the forums with elinks from time to time. I've found that with lynx and elinks the biggest problem is the textarea field where we type our messages. One problem is that they often fall across the screen boundary, which makes it hard to review what you've written. This isn't as much of a problem on a 68k Mac because it is easier to position the page so that the field is contained by the screen (as long as the browser ensures that the field cannot be larger than the screen).

elinks has another solution to this problem, one that may be viable for any browser: you can pull up a full screen editor for some form fields. Imagine hitting a key and a textedit window (the API, not the program) pops up. Rather than editing in a portal that is meant to fit into a web page, you are now using a proper editor. Since many sites support similar markup for messages, you may even be able to make a WYSIWYGish text editor that automatically dumps the markup into the textarea field when you close the window.

#35 Thu, 7 Aug 2008 - 20:28

I was thinking that all forms beyond a single input and a submit button would be replaced with an icon that when clicked would pop up a modal dialog of the form. What do you think of that idea?

#36 Thu, 7 Aug 2008 - 21:18

You know, I think a decent approach might be to write a bare basics text only browser that has room for expansion (maybe via some sort of plug-in deal). I say this for two reasons...

1) If you were to try for tabbed browsing, forms, graphics and flash in v.1 it would probably never get off the ground and

2) A System 6 browser that ends up being a memory hog isn't going to be of much use. I'd suggest that basic functionality on a 1 Meg Plus would be a desirable goal.

Of course I'm probably talking out of my arse here, but I think that support for third-party plug-ins would allow the user to customize the browser within the confines of their Mac's limitations and it might also be a way to get more people working on the project.

#37 Thu, 7 Aug 2008 - 21:27

Isn't the problem with this not the User Interface on the Mac (Text or real graphics) but the lack of a TCP/IP Stack in System 6?

I think the biggest obstacle may be the scarcity of ethernet options for System 6 hardware... it just occurred to me that since I no longer have a dial-up internet account, I have no way of getting any of my compact Macs online (other than dialing in to a shell account, in which case I don't need a browser on the Mac).

That's not intended as an argument against the project, just an observation.

#38 Thu, 7 Aug 2008 - 21:31

I think the biggest obstacle may be the scarcity of ethernet options for System 6 hardware.

I have a MacIP solution, which runs on NetBSD and Linux, shouldn't be a problem to get going on Darwin.

So either using AsantePrint or LocalTalk bridge on another mac with both LocalTalk and Ethernet would solve that.

#39 Thu, 7 Aug 2008 - 22:30

expansion (maybe via some sort of plug-in deal). / third-party plug-ins would allow the user to customize the browser within the confines of their Mac's limitations and it might also be a way to get more people working on the project.

Sounds a bit like an open-API open source Cyberdog

#40 Thu, 7 Aug 2008 - 22:59

Another option is to set up a machine that will host PPP or SLIP connections.

But there is plenty of ethernet hardware for system 6 Macs. The most obvious one is a NuBus card in a modular Mac, but I'm fairly certain that a few of the SCSI to ethernet bridges worked under System 6.

At any rate, even with a compact you're better off than I. I don't own a Mac anymore. Still, I'm talking about what I wished for back in the day.

#41 Thu, 7 Aug 2008 - 23:00

Plugins on a 68000 macintosh running 6.0.8 are really crippled. You either do code resources with their limitations or else require the use of ASLM.

The only architecture that would have made sense is the CTB, but CTB works from terminals and streams and this is stretching things.

For example you could make a web-page renderer as a terminal tool, hence to change browser/renderer you change terminal tool. But this doesn't map well for clicking on links or button pushing.

I suggest a clean open source minimal browser, do one job and do it well.

MacTCP at one end, application in the middle, QuickDraw on the other.

#42 Thu, 7 Aug 2008 - 23:07

I have a MacIP solution, which runs on NetBSD and Linux, shouldn't be a problem to get going on Darwin.
So either using AsantePrint or LocalTalk bridge on another mac with both LocalTalk and Ethernet would solve that.

This gives me an idea that's... off-topic

#43 Sat, 9 Aug 2008 - 18:29

Whoever makes a System6 browser shall be the first beneficiary in my will.

Of the features you mention, I'd like ta see ....

* Image Support

* Can DownLoad Files

In my completely layman's opinion I think having different fonts would overly tax an already "struggling", if you will, piece of software -- lol -- and just using either Chicago or Palatino would be fine as far as fonts go and would be that much less for the browser to hafta do.

:b&w:

#44 Sat, 9 Aug 2008 - 20:21

In my completely layman's opinion I think having different fonts would overly tax an already "struggling", if you will, piece of software...

I suspect image support would be much more of a resource issue than fonts (assuming the browser sticks to one font each for serif, sans-serif and monospaced)...

[also a layman's opinion]

#45 Sat, 9 Aug 2008 - 20:43

It would be best to avoid the images as

(a) each graphic requires another HTTP transaction to download, and will look crap on 512x384 monochrome, especially when most of the web assumes colour at a minimum of 800x600.

(b) fonts should be used, but limited to supporting the originally defined header levels ("h1", "h2", etc.), not the custom font stuff.

#46 Sat, 9 Aug 2008 - 21:49

With respect to fonts, those should be handled by the Macintosh's Toolbox. So, relatively speaking, they would be easier to implement.

With respect to image support, most images are compressed. That would add a definite load to the CPU. Since a lot of pages have a lot of images, the browser would probably be unbearably slow if the images were rendered inline. (I seem to recall a discussion a number of years back where a moderate sized JPEG would take minutes to render on an Apple IIgs. Now the CPU on a Mac is considerably faster, but not that much faster! Also, modern compression schemes usually place the compression ratio ahead of decompression time, since the CPU is cheaper than the bandwidth. So I'm *guessing* that formats like PNG would be slower to decompress than JPEG.)

There is also the issue of library support. Even if you could obtain a library that decompresses JPEG files under System 6, it may not handle the most recent variations. I doubt that anyone has ported a library for stuff like PNG to System 6.

So I'm afraid that image support will be fairly challenging.

Probably the best thing to do is look at what word processors of the era were able to render, and how well and how quickly they did it. Factor in stuff like HTML being harder to parse than a binary format, particularly for layout (content shouldn't be too bad), and embedded data being more complex (e.g. images). Then factor in your own programming skills. That will give you a rough idea of what capabilities should or shouldn't be included.

#47 Sat, 9 Aug 2008 - 21:54

With respect to fonts, those should be handled by the Macintosh's Toolbox. So, relatively speaking, they would be easier to implement.

There are two parts to the font issue...

The first is that HTML, as originally defined, specified abstract font types for body and headers; it was left to the renderer to decide what looked best.

Later versions of HTML have pages specifying actual font types by name, which did not necessarily map well to what was installed on the system.

#48 Sun, 10 Aug 2008 - 00:55

True. I'm inclined to say go with the former and neglect the latter. Then again, I'm the type of person who believes that document structure is more important than the eye candy.

The other option is to create a list of font mappings. Back in my naughty days I'd say something like use Arial, but have Helvetica as a backup. Now System 6 doesn't have either by default, but it can recognise both fonts as proportional, sans serif, then select something appropriate. Of course it won't cover the case when a dork of a website designer decided not to have a common font as a backup, but c'est la vie.
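A font mapping list like that could be sketched as below, in Python for brevity. Both tables are assumptions for illustration: which web font names get recognised, and which installed faces (Geneva, New York, Monaco here) stand in for each generic family.

```python
# Web font name -> generic family. Illustrative, not exhaustive.
GENERIC_FAMILY = {
    "arial": "sans-serif",
    "helvetica": "sans-serif",
    "verdana": "sans-serif",
    "times": "serif",
    "times new roman": "serif",
    "georgia": "serif",
    "courier": "monospace",
    "courier new": "monospace",
}

# Generic family -> a face assumed to be installed on the Mac.
INSTALLED = {
    "sans-serif": "Geneva",
    "serif": "New York",
    "monospace": "Monaco",
}


def pick_font(requested, default="Geneva"):
    """Walk a comma-separated font list and return the first usable face."""
    for name in requested.split(","):
        key = name.strip().strip("'\"").lower()
        if key in INSTALLED:            # page named a generic family directly
            return INSTALLED[key]
        family = GENERIC_FAMILY.get(key)
        if family:                      # known font, map through its family
            return INSTALLED[family]
    return default                      # nothing recognised
```

A page asking for "Arial, Helvetica" gets the sans-serif face, and a page that names only an unknown font falls through to the default.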

#49 Sun, 10 Aug 2008 - 04:39

Well, I just looked at this page with page style turned off ... it's not all that different, just a lot plainer and a bit messier. It gives me an idea of what I could perhaps expect from a very basic System 6 browser. Party like it's 1993!

Try looking at it from Lynx... it's fairly horrible, but can be used if you're determined. I think that's a better approximation of a minimal System 6 browser.

Links is a great browser. If that were made for 68k, it would be perfect.

#50 Sun, 10 Aug 2008 - 23:01

Okay, I need some guidance with some HTML parsing...

I am very happy about parsing XML, the rules are very consistent.

However HTML is another pile of poo when people shove javascript in it which contains < and > characters.

Surely these should have been escaped with "&lt;" etc?

Eg

Code:

.....

My problem is I see the "

What is the correct way to detect the end of JavaScript in a manner that does not require me to parse JavaScript?

I basically want to end up with a TEXT node with the javascript under an ELEMENT named "script".
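One common approach, as far as I know, is to treat everything after the opening script tag as raw text and simply scan forward for the next "</script", ignoring case, without looking at the JavaScript at all. A minimal sketch in Python (the function name and the return shape are my own assumptions):

```python
import re

# The end tag, in any letter case, with optional junk before the ">".
SCRIPT_END = re.compile(r"</script\b[^>]*>", re.IGNORECASE)


def read_script(html, start):
    """html[start:] begins just after the opening <script ...> tag.

    Returns (script_text, index_just_past_the_end_tag).
    """
    match = SCRIPT_END.search(html, start)
    if match is None:
        # No end tag at all: treat the rest of the page as script text.
        return html[start:], len(html)
    return html[start:match.start()], match.end()


page = "<script>if (a < b && c > d) { x(); }</SCRIPT><p>hi</p>"
text, after = read_script(page, page.index(">") + 1)
```

The unescaped < and > inside the script never get examined as markup, and `text` is what would go into the TEXT node under the "script" ELEMENT.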

#51 Sun, 10 Aug 2008 - 23:16

1- Doesn't bomb when you quit

2- Plain text layout

3- Can download files

4- Forms

5- Beyond plain text layout (ie, bold, font sizes, etc)

6- Image Support

#52 Mon, 11 Aug 2008 - 04:52

1- Doesn't bomb when you quit

2- Plain text layout

3- Can download files

4- Forms

5- Beyond plain text layout (ie, bold, font sizes, etc)

6- Image Support

I'm putting something together that will do everything except 6.

My excuse being I am targeting monochrome 512x384, so 99% of today's embedded images will look awful.

The image support will be supported by item 3. You will be able to happily download images as files (or you could scowl while doing it).

Thinks, would need a "Content-Type" to TYPE/CREATOR mapper.
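Such a mapper could be little more than a lookup table. A sketch in Python; the specific type and creator codes below are illustrative assumptions and would need checking against whatever applications are actually installed.

```python
# MIME type -> (file TYPE, CREATOR). The codes are placeholders for illustration.
TYPE_CREATOR = {
    "text/plain": ("TEXT", "ttxt"),
    "text/html": ("TEXT", "ttxt"),
    "image/gif": ("GIFf", "????"),
    "image/jpeg": ("JPEG", "????"),
}

UNKNOWN = ("????", "????")  # fallback when nothing matches


def type_creator(content_type):
    """Map a Content-Type header value to a (TYPE, CREATOR) pair."""
    # Drop parameters such as "; charset=..." and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return TYPE_CREATOR.get(mime, UNKNOWN)
```

For example, `type_creator("Text/HTML; charset=ISO-8859-1")` ignores the charset parameter and looks up "text/html".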

#53 Mon, 11 Aug 2008 - 05:42

My excuse being I am targeting monochrome 512x384, so 99% of today's embedded images will look awful.

Of course, I gave other reasons for not supporting images. But I just had to comment that I think that colour images resampled as 1 bpp look pretty cool on the Mac. I forget the name of the algorithm that Apple used, but it does a good job. (Particularly compared to dithering algorithms that I encountered in my 386/486 days, which used a regular rectangular arrangement for the dithered pixels.)

#54 Mon, 11 Aug 2008 - 19:18

There are many halftoning algorithms, ranging from simple-but-crummy, to complex-but-less-crummy. Floyd-Steinberg seems to be very popular on the Mac (and elsewhere), as it seems to balance complexity with quality in some reasonable fashion. That's probably the algorithm you're thinking of.
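For reference, the usual textbook form of Floyd-Steinberg error diffusion is sketched below in Python. It assumes the image has already been reduced to grey values from 0 to 255, and it says nothing about what Apple's own code actually did.

```python
def floyd_steinberg(gray):
    """Dither rows of 0..255 grey values down to rows of 0 (black) / 1 (white)."""
    height = len(gray)
    width = len(gray[0]) if height else 0
    work = [[float(v) for v in row] for row in gray]  # copy that absorbs error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255.0 if old >= 128 else 0.0  # nearest of black or white
            out[y][x] = 1 if new else 0
            err = old - new
            # Push the rounding error onto pixels not yet visited:
            # 7/16 right, 3/16 below-left, 5/16 below, 1/16 below-right.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Because the error is pushed onto neighbours instead of being thrown away, a flat grey area comes out as a scattered mix of black and white pixels, not a regular grid.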

#55 Mon, 11 Aug 2008 - 21:44

That Floyd-Steinberg name sounds familiar. And I think it's pretty cool how a black and white (not even monochrome!) image can look so good. It ranks right up there with those old teletype reproductions of photographs (i.e. the ones with over-typed letters).

#56 Mon, 11 Aug 2008 - 22:02

I remember the halftoned tiger image that lived on one of the early Hypercard stack examples. It rendered quite nicely on a compact Mac screen, as well as on the early Imagewriters. Now why can't the Wall Street Journal do as well?

#57 Tue, 12 Aug 2008 - 16:26

Okay whizzes, what am I supposed to do with the HTML garbage?

Code:

....	
	{for blog_item in blogs}

......

returned from http://www.guardian.co.uk

I'm discarding comments and scripts, but how is this supposed to be handled? All I can think of is if you read an element and after the element name you find something that is not

/

>

whitespace

then junk everything until the > character.

Ideas?
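The rule described above can be sketched in Python roughly as follows. The function name and return shape are my own, the sample inputs are invented (not the actual Guardian markup), and it does not cope with a ">" inside a quoted attribute value.

```python
def read_tag(html, pos):
    """html[pos] is "<". Returns (name, index_after_tag); name is None for junk."""
    i = pos + 1
    if i < len(html) and html[i] == "/":
        i += 1  # closing tags go through the same path
    start = i
    while i < len(html) and html[i].isalnum():
        i += 1
    name = html[start:i].lower()
    # Find the end of the tag either way, so junk can be skipped whole.
    end = html.find(">", i)
    end = len(html) if end == -1 else end + 1
    # After the name only "/", ">" or whitespace is acceptable.
    if not name or (i < len(html) and html[i] not in "/> \t\r\n"):
        return None, end
    return name, end


good = "<p class='x'>hi"
bad = "<a{for blog_item in blogs}>x"
good_name, good_end = read_tag(good, 0)
bad_name, bad_end = read_tag(bad, 0)
```

The well-formed tag comes back as "p", while the template leftovers come back as None with the position just past the ">", so parsing carries on from there.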

#58 Tue, 12 Aug 2008 - 21:00

That seems logical to me. Offhand I can't think of a situation where that would fail.

#59 Tue, 12 Aug 2008 - 21:20

How are you proposing to handle verbose sites within 1MB RAM? A lot of bloggers, just picking on them as an example, have pages which contain greater than 1MB of text data. By parsing that text, you can immediately throw away a lot of the advanced headers, CSS related data etc but you are still going to end up with a lot of data to buffer, process and finally present.

#60 Tue, 12 Aug 2008 - 21:54

How are you proposing to handle verbose sites within 1MB RAM?

Cry?

Or, you avoid buffering all of the data from the web page. Something like <h1> can be regarded as "switch into header mode". You stick a shorthand (perhaps one byte) into a processed data buffer. You automatically reduce the token to a fraction of its original size, a quarter in my example, but much less if there is other information added to that token such as styles. (Sorry, but I forget my HTML terminology.)

I'm suggesting that one byte control codes will work since I'm assuming an ASCII character set. You have at least 128 unused characters due to the high bit (ASCII is only 7-bits), and a few more non-displaying control characters in the low 7-bits. If the processed page is still larger than the available memory, you buffer it to disk, but I doubt that will happen often since few web pages contain more than a few kilobytes of readable content.

But yeah, unfortunately you do need some sort of buffer because you need that for screen redraws. Still, I don't see it having a large impact on performance as long as you have a few screens' worth of buffered data in RAM (both before and after the content that is currently being displayed). After all, the user will probably be spending most of their time reading. You simply grab data from the disk buffer when they are doing that reading.