
Tobi Waves



Wilbur, Alice and a Friend of mine

Tuesday, December 09, 2003 22:06 // Arthouse Alba, Zurich, Switzerland

The other night I went to see "Wilbur wants to kill himself" (xrl.us ...), a movie about a guy, his brother, and a woman and her kid living in some Scottish town. The guy has a book shop and falls in love with the woman, and the brother keeps trying to kill himself. The movie is by Lone Scherfig, a Danish writer/director who was quite successful with the movie Italian for Beginners. I loved the story and the acting and above all the Scottish accent.

The strangest thing happened to me a few minutes into the film, when Alice (the woman) entered the scene for the first time. She looks exactly like a friend of mine. I am quite sure it was not her, because she does not speak with a Scottish accent, and she is not an actress. Nevertheless I kept comparing her and Alice throughout the film. How would she act in a similar situation? And just to make it more complex, Alice is just a role, played by Shirley Henderson (xrl.us ...) but written by Scherfig, so how can it have any bearing on my friend's behavior? Or do looks influence how people behave? Did Scherfig write the part of Alice for Shirley? Are Alice and my friend twins separated at birth?

Well, it was a weird experience, but definitely a great movie. Go watch it!

 

The West Wing and reality

Tuesday, December 09, 2003 22:36 // Aarweg, Olten, Switzerland

I admit, I am a soap opera junkie. I just love watching them. My current favorite is The West Wing, written by Aaron Sorkin (en2.wikipedia.org ...). Tonight I made it through the final episode of the first season. This time the outlook is great, as the show is still running in the US, now in its fifth season.

The West Wing is about people working in the White House for the President. It's also about the President himself. President Bartlet, played by Martin Sheen, is a liberal Democrat's dream. What a contrast to the current reality. An American friend told me recently that she finds it rather disturbing to watch the show, seeing what the reality could also be. Michael Moore even nominates Bartlet for President, along with Oprah, in his latest book.

I am watching The West Wing on DVD, and there was also a 15-minute Making Of feature on one of the discs. There the actors talk about their parts and some bits of the sets are shown. Hearing the actors talk was odd. They were so different from the roles they play in the show. Not only did they talk differently, but their body language was different as well. I have never noticed something like this before in such a show. The characters in The West Wing make a very authentic impression on me, much more than the actors themselves actually. At least as far as the story line and the dialogs are concerned, this has to be credited to Aaron Sorkin. Maybe Michael Moore should be nominating Aaron Sorkin along with Oprah and Bartlet; someone has to write their lines after all.

 

Finding Needles in a TB Haystack.

Monday, February 02, 2004 10:20 // Audi Max, ETH Zurich, Switzerland

A talk by Urs Hölzle, Vice President for Technology.

About Google

Mission: To organize the world's information and make it universally accessible and useful.

An international company: >50% of traffic from outside the US

Engine has 4 Billion pages in index

Profitable since Q1/2001

23 office locations worldwide.

15k boxes, several TB disk storage

Over 1000 queries a second on Dec 25th at 2am.

Engineering Offices in the US, Zurich and Bangalore

About the Web

Static web: 167 TB in 11 giga pages, but the dynamic web is 92 PB in size. (Estimates)

1 in 4 hosts on the net run a webserver.

Problem: Data, users and hosts all grow exponentially. This means the problem of finding useful information grows exponentially too, which makes for interesting problems.

Google Infrastructure

A highly reliable system based on low-cost commodity hardware. Redundancy has to be built into the software and hardware. Monitoring, repairing and maintaining these boxes is a prime problem.

The Google Filesystem GFS

Stripe files across many boxes and replicate them on multiple servers.

Components: Master - keeps the directory and plans the file layout. ChunkServer - holds the data. Clients - use the data. (Chunk size is 64 MB. Data is cached on the client once retrieved. SOSP'03 (www.cs.rochester.edu ...) )

10+ Clusters of 1000+ boxes.

350 TB Filesystem
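To make the master/chunkserver split a bit more concrete, here is a minimal Python sketch of striping and replication. Only the 64 MB chunk size comes from the talk; the class name, the round-robin placement and the replication factor of 3 are my assumptions for illustration, not Google's actual code.

```python
import itertools

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size quoted in the talk


class Master:
    """Keeps the directory: which chunkservers hold which chunk of which file."""

    def __init__(self, chunkservers, replicas=3):
        if replicas > len(chunkservers):
            raise ValueError("need at least as many chunkservers as replicas")
        self.chunkservers = list(chunkservers)
        self.replicas = replicas          # assumed replication factor
        self.layout = {}                  # (filename, chunk_index) -> [server, ...]
        self._next = itertools.count()    # round-robin cursor (assumed policy)

    def place_file(self, filename, size_bytes):
        """Plan the layout: stripe chunks over servers, replicate each chunk."""
        n_chunks = max(1, -(-size_bytes // CHUNK_SIZE))  # ceiling division
        for index in range(n_chunks):
            start = next(self._next)
            servers = [
                self.chunkservers[(start + r) % len(self.chunkservers)]
                for r in range(self.replicas)
            ]
            self.layout[(filename, index)] = servers
        return n_chunks

    def locate(self, filename, offset):
        """Answer a client: which servers hold the chunk covering this offset?"""
        return self.layout[(filename, offset // CHUNK_SIZE)]
```

A client would ask the master once with locate() and then talk to one of the returned chunkservers directly, so the bulk data never passes through the master.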

How to be a Search Engine

Crawling: Recursive process. Problems: dynamic pages, slow servers, management of the link list, session ids in the URL, how to prioritize the URLs, being nice to the web servers, detection of duplicates, avoiding traps, actively filling in forms to pull "hidden" contents, figuring out when a page needs to be re-crawled.
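A few of these crawling problems (session ids in the URL, duplicates, prioritizing, being nice to servers) fit into a small Python sketch of a URL frontier. The parameter names in SESSION_PARAMS, the 10 second delay and the Frontier class are my own assumptions, not anything shown in the talk.

```python
import heapq
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}  # assumed names


def normalize(url):
    """Drop session ids and fragments so the same page is not crawled twice."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", urlencode(query), ""))


class Frontier:
    """Prioritized list of URLs to fetch, with duplicate and politeness checks."""

    def __init__(self, min_delay=10.0):
        self.min_delay = min_delay   # seconds between hits on one host (assumed)
        self.seen = set()
        self.heap = []               # entries are (-priority, sequence, url)
        self.next_allowed = {}       # host -> earliest time of the next fetch
        self._seq = 0

    def add(self, url, priority=0.0):
        """Queue a URL unless its normalized form was already seen."""
        url = normalize(url)
        if url in self.seen:
            return False
        self.seen.add(url)
        heapq.heappush(self.heap, (-priority, self._seq, url))
        self._seq += 1
        return True

    def pop(self, now):
        """Return the best URL whose host may be contacted at time `now`."""
        deferred, result = [], None
        while self.heap:
            item = heapq.heappop(self.heap)
            host = urlsplit(item[2]).netloc
            if self.next_allowed.get(host, 0.0) <= now:
                self.next_allowed[host] = now + self.min_delay
                result = item[2]
                break
            deferred.append(item)
        for item in deferred:        # put back what we had to skip
            heapq.heappush(self.heap, item)
        return result
```

This only strips session ids passed as query parameters; traps, form filling and re-crawl scheduling are left out.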

Indexing: Words by document and position in the document. One tera words in the index.
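"Words by document and position" is what is usually called a positional inverted index. A toy Python version, assuming whitespace tokenization and in-memory dictionaries (both my simplifications), could look like this:

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to {doc_id: [positions]} for all documents."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    return index


def phrase_search(index, phrase):
    """Return ids of documents containing the words of `phrase` consecutively."""
    words = phrase.lower().split()
    if not words or any(w not in index for w in words):
        return set()
    hits = set()
    for doc_id, starts in index[words[0]].items():
        for start in starts:
            # every following word must sit at the next position in this doc
            if all(start + i in index[w].get(doc_id, ())
                   for i, w in enumerate(words[1:], 1)):
                hits.add(doc_id)
                break
    return hits
```

Storing positions is what makes phrase queries and word proximity possible, at the price of a much bigger index.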

Ranking: Hard problem. All traditional assumptions on searching, like long, coherent, high quality documents, are not valid for web documents. Google's idea is to define a PageRank for figuring out the importance of a page. The PageRank of a page is the sum of the contributions from the pages pointing to it, where each page contributes its own PageRank divided by its number of out-links to each of its target pages. In reality it is more complex. Google has about 100 factors in its real ranking function, like font size, color, proximity to other words.
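The simple PageRank rule can be computed by repeated passes over the link graph. The Python sketch below follows the rule from the talk; the damping factor of 0.85 and the handling of pages without out-links are taken from the published PageRank formulation and are my additions, they were not part of the talk. The 100 or so extra factors are not modeled at all.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: rank}, summing to 1."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # a page without out-links spreads its rank over all pages
            targets = links.get(page) or pages
            # each target receives rank / number of out-links
            share = damping * rank[page] / len(targets)
            for target in targets:
                new[target] += share
        rank = new
    return rank
```

With links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}, page c collects the most rank because three pages point to it, and d the least because nobody links to it.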

Serving: Partition the data to different servers and have each solve a sub-problem of each query. The query goes to the Google Webserver, which queries the Index Farm and then accesses the Doc Farm for the real data. Additional services come from the Ad Server and the Spelling Server. IEEE Micro, 2003 has more on the structure (www.computer.org ...) .
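The partitioning idea is a classic scatter-gather: every index shard answers for its own slice of the documents and the front end merges the partial results. A small Python sketch, where IndexShard, serve and the precomputed scores are hypothetical names and data of mine, and threads stand in for separate machines:

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


class IndexShard:
    """One partition of the index: word -> {doc_id: score} for its documents."""

    def __init__(self, postings):
        self.postings = postings

    def search(self, word, k):
        """Return this shard's k best (score, doc_id) pairs for the word."""
        hits = self.postings.get(word, {})
        return heapq.nlargest(k, ((score, doc) for doc, score in hits.items()))


def serve(query_word, shards, doc_store, k=3):
    """Fan the query out to every shard, merge the top hits, fetch documents."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: s.search(query_word, k), shards))
    best = heapq.nlargest(k, (hit for part in partials for hit in part))
    # doc_store plays the role of the doc farm holding the real data
    return [(doc, score, doc_store[doc]) for score, doc in best]
```

Each shard only needs to return its own top k, so the amount of data travelling to the front end stays small no matter how big the index gets.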

Advertising: Find the best ad, relevant to the query. This is a very important problem as this is the main source of revenue. Only show an ad which has a chance to be clicked on; if the click-through is low, the ad will be dropped. Advertisers only pay for ads actually clicked.

Google Playground

There is lots of data and computing infrastructure at Google. Google pays people who spend their time figuring out new ways to analyze and present this data: (labs.google.co ...)

 

Human Transmittable Computer Virus

Monday, March 01, 2004 22:43 // The Internet

In the good old times, when men were still men and computer virus writers were still technically brilliant hackers, viruses used the countless holes in Microsoft's ubiquitous Outlook eMail software to spread.

But even then, the rule was generally simple. If you don't want to be infected, don't run any code if you don't know where it's coming from. If you have Outlook, make sure it's patched and properly configured. In Unix circles, people made fun of the whole situation by sending out mails which claimed to be a solidarity virus, calling upon the Unix user to copy the mail to all addresses in his address book, to emulate a virus as a gesture of solidarity with the Windows crowd.

Now, a few years later, Outlook has matured to the point that there have been no major holes for months. Nevertheless, eMail viruses still crop up and spread. Virus writers have started to attack the user's mind directly, by writing messages into the body of the virus email with the purpose of confusing the user into clicking the virus attachment, forgetting all the good advice they got. Fortunately the anti-virus software gets updated so fast that viruses are normally contained quickly.

Today though, mark the date, the whole matter entered an entirely new stage. I got the first virus which was contained in a password protected zip file. The password was contained in the accompanying email, so it is easy for a human to unpack, but anti-virus software has no chance as it cannot decrypt the zip file containing the virus. As a concept this sounds fine, but what totally kills me is that it seems to work. Since this morning, I have been getting an increasing number of encrypted zip file viruses. There must actually be people who get this virus, unzip it using the supplied password and then run the thing in order to get infected.

I wonder how many people would hang themselves if they got a rope in the mail. Warden, make sure all the cells are locked.

 

Postfix for Spam Protection

Saturday, March 06, 2004 10:05 // LinuxForum 2004, Symbion, Copenhagen, Denmark

by Ralf Hildebrandt

How to use Postfix as a crude but cheap filter against spam in front of more complex filters like SpamAssassin.

Sources of Spam

An important source of spam these days is misconfigured web proxies which proxy to SMTP ports as well and let outsiders connect.

Protection

Use RBLs for open proxies and open relays.

Reject mail from faked sender addresses (see below).

Insist on RFC conformance (this can make you lose lots of real mail too, as there are many misconfigured normal mail servers).

Content Filters: Altermime, SpamAssassin

On Postfix

Use the snapshot version of Postfix as it is really stable and has all the latest features.

Use a caching nameserver to speed up DNS lookups.

By default Postfix is configured to only accept mail from your local network for external destinations. This has no influence on spam though.

Be careful choosing RBLs because there are many badly maintained blacklists out there. A blacklist must have clear criteria and a delisting procedure.

postmaster@yourdomain and abuse@yourdomain must accept all mail; this must be explicitly listed in smtpd_recipient_restrictions.

A good RBL recommended by Ralf: cbl.abuseat.org.

When you are using RBLs make sure that you can quickly add exceptions to your system.

Rejecting mail to unknown users at the smtpd stage is very efficient: it saves traffic and it also saves you from sending bounces.

Postfix can use various directory services to figure out which users exist. Postfix 2.1 will even cache the answers.

Using right-hand-side sender blacklists (RHSBLs) may also help, but be careful. Look at dsn.rfc-ignorant.org, postmaster.rfc-ignorant.org, abuse.rfc-ignorant.org, whois.rfc-ignorant.org.

RBL/RHSBL checks are expensive because of all the DNS lookups. Perform them as late as possible in the restrictions list, after the cheap checks.
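To show why blacklist checks are expensive and why their order matters, here is a small Python sketch of what a DNS blacklist lookup does and of a "cheap checks first" pipeline. This is not how Postfix is implemented; check_client, the return strings and the zone name bl.example.org are made up for illustration. The reversed-octet query name is the usual convention for DNS blacklists.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets + blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """An address counts as listed if the query name resolves at all."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


def check_client(ip, recipient, local_users, zones,
                 resolve=socket.gethostbyname):
    """Run the cheap local checks first and the DNS lookups last."""
    local_part = recipient.split("@", 1)[0].lower()
    if local_part in ("postmaster", "abuse"):   # must accept all mail
        return "ok"
    if recipient not in local_users:            # cheap: table lookup
        return "reject: unknown user"
    for zone in zones:                          # expensive: one DNS query each
        if is_listed(ip, zone, resolve):
            return "reject: listed in " + zone
    return "ok"
```

Mail to unknown users never causes a DNS query here, which is the whole point of putting the blacklist checks at the end.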

Sender address verification

Check if the sender address is either known to be valid or can be verified to be valid. Postfix has special support for this, as it can send a test message to the mail server responsible for the sender address. The sender will not notice this, as Postfix only starts sending a mail but aborts before giving any message body.

Make sure you are really careful, as this can cause you to lose mail from people who are not able to correctly spell their sender address. One option is to apply these sender check restrictions only to suspected domains.
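The probe itself is simple enough to sketch with Python's smtplib: start a delivery to the sender address and hang up before DATA. This is my own illustration of the idea, not Postfix code. The function name, the HELO name probe.example.org and the use of the null envelope sender are assumptions, and the mail host has to be passed in because the standard library cannot look up MX records.

```python
import smtplib


def sender_is_deliverable(sender, mail_host, helo_name="probe.example.org",
                          smtp_factory=smtplib.SMTP):
    """Start a delivery to `sender` and stop before DATA.

    Returns True if the server accepts the recipient, False if it rejects it
    permanently (5xx), and None if the answer is inconclusive (4xx or errors).
    """
    try:
        conn = smtp_factory(mail_host, 25, timeout=30)
    except OSError:
        return None
    try:
        conn.helo(helo_name)
        conn.mail("")                  # null envelope sender (assumption)
        code, _ = conn.rcpt(sender)    # the actual probe
    except (smtplib.SMTPException, OSError):
        return None
    finally:
        try:
            conn.quit()                # hang up without ever sending a body
        except (smtplib.SMTPException, OSError):
            pass
    if 200 <= code < 300:
        return True
    if 500 <= code < 600:
        return False
    return None
```

Treating temporary errors as "don't know" rather than "invalid" matters, since otherwise a slow or greylisting server on the other side would cost you real mail.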

 

SMTP Authentication and Certificate Based Relaying

Saturday, March 06, 2004 11:17 // Symbion, Copenhagen, Denmark

by Patrick Koetter

How to let mobile users use your server as a mail relay. IP based restrictions do not work as the mobile users will have random IP addresses.

SMTP AUTH

Using Cyrus SASL2 and OpenSSL together with Postfix. You can configure Postfix such that it allows relaying access for users who are properly authenticated. Most mail clients support SMTP authentication.

The problematic thing is to properly configure SASL. Get the CVS version as it is less buggy than the official 2.1.17; it even has some minimal documentation.

SASL configuration is governed by a config file named after the program using the SASL library. In our case this is smtpd.conf.

If you use SASL with plaintext passwords, make sure it only allows AUTH when TLS is in operation.

Check out Patrick's howto on this (postfix.state-of-mind.de ...)
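From the mobile user's side, the "only AUTH when TLS is in operation" rule looks roughly like the following Python sketch using smtplib. The function name, the port 587 default and the refusal to continue without STARTTLS are my choices for illustration; the server side is what the howto covers.

```python
import smtplib
import ssl


def send_via_relay(host, user, password, sender, recipients, message,
                   port=587, smtp_factory=smtplib.SMTP):
    """Authenticate to the relay, but only once the connection is encrypted."""
    conn = smtp_factory(host, port, timeout=30)
    try:
        conn.ehlo()
        if not conn.has_extn("starttls"):
            raise RuntimeError("server offers no STARTTLS; not sending password")
        conn.starttls(context=ssl.create_default_context())
        conn.ehlo()                    # capabilities may change after STARTTLS
        conn.login(user, password)     # SMTP AUTH over the TLS channel
        return conn.sendmail(sender, recipients, message)
    finally:
        conn.quit()
```

sendmail() returns a dictionary of refused recipients, so an empty result means the relay accepted the mail for everybody.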

Certificate based Relaying

For people running mobile Unix it is possible to set up a local mail server which just forwards all mail to the official mail server of your site. Configure the Postfix smtp client on the mobile machine to use TLS with a client certificate, and store the client's cert on the server. Now configure the server to ask clients for a certificate when they connect. If a client submits a valid (known) certificate it will be allowed to relay even if it has an IP address outside the local network.

The cool thing about this is that now any program on the mobile Unix client can send mail via the local mail server to the company mail server without further problems.
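The relay decision on the server boils down to "local address, or known certificate". A minimal Python sketch of that decision follows; the function names and the choice of SHA-256 fingerprints are assumptions of mine and not Postfix internals. It also assumes the TLS handshake has already proven that the client holds the private key for the certificate it presented.

```python
import hashlib
import ipaddress


def fingerprint(cert_der):
    """Colon-separated SHA-256 digest of a DER-encoded certificate."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def may_relay(client_ip, cert_der, local_networks, known_fingerprints):
    """Allow relaying for local addresses or for clients with a known cert."""
    address = ipaddress.ip_address(client_ip)
    if any(address in ipaddress.ip_network(net) for net in local_networks):
        return True
    return cert_der is not None and fingerprint(cert_der) in known_fingerprints
```

Storing fingerprints instead of whole certificates keeps the server-side table small, and removing one line is enough to lock out a lost laptop.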

 
