Monday, September 27, 2004 09:00 // SANE 2004, RAI, Amsterdam, Netherlands
A tutorial by Joost van Dijk.
About the IPv6 Header
The IPv6 header is much simpler than the IPv4 header. It has a fixed length of 40 bytes.
Rule 1: Make the frequent case fast
Complex things like fragmentation are handled with extension headers. Because IPv6 routers never fragment packets, they can make do with looking at the base header. Fragmentation happens in the sending host. The sender uses Path MTU discovery.
With the flow label in the header, the router can see which packets belong together without looking into the packets themselves.
Extension headers can be inserted between the IPv6 header and the payload data. These extension headers come in a predefined order. This order again ensures that the routers only have to look at the first few extension headers because only they can contain information relevant to a router.
Each header has a field called next header which defines the type of the next header.
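To make the fixed 40-byte layout concrete, here is a minimal Ruby sketch that takes a base header apart. The field widths follow RFC 2460; the helper name parse_ipv6_header and the sample addresses are made up for this illustration and were not part of the tutorial.

```ruby
require "ipaddr"

# Splits the fixed 40-byte IPv6 base header into its fields (layout per RFC 2460).
# parse_ipv6_header is a hypothetical helper name used only for this sketch.
def parse_ipv6_header(bytes)
  raise ArgumentError, "need at least 40 bytes" if bytes.bytesize < 40

  # N = 32-bit big-endian, n = 16-bit big-endian, C = 8-bit, a16 = 16 raw bytes
  word, payload_length, next_header, hop_limit, src, dst =
    bytes.unpack("NnCCa16a16")

  {
    version:        word >> 28,           # top 4 bits
    traffic_class:  (word >> 20) & 0xff,  # next 8 bits
    flow_label:     word & 0xfffff,       # low 20 bits
    payload_length: payload_length,
    next_header:    next_header,          # type of the header that follows, e.g. 6 = TCP
    hop_limit:      hop_limit,
    source:         IPAddr.new_ntoh(src).to_s,
    destination:    IPAddr.new_ntoh(dst).to_s
  }
end

# Sample header: version 6, flow label 0x12345, 20 bytes of payload,
# next header 6 (TCP), hop limit 64. The addresses are arbitrary examples.
SAMPLE_HEADER = [(6 << 28) | 0x12345, 20, 6, 64].pack("NnCC") +
                IPAddr.new("2001::1").hton + IPAddr.new("2001::2").hton

p parse_ipv6_header(SAMPLE_HEADER)
```

A router that only forwards packets needs nothing beyond these fields, which is the point of keeping the base header fixed and small.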
Implementations
Most current OSes and routers support IPv6. Since Windows XP SP2 there is a production-quality IPv6 implementation for Windows as well. The first prototypes were available as separate downloads from MS.
Enabling it on Windows XP SP2
netsh interface ipv6 install
IPv6 addressing
The first few bits of an IPv6 address define the type of the address. Every IPv6 interface has a link local address (private local address space).
Addresses are written as 8 colon-separated groups of 4 hex digits.
One sequence of zero groups can be abbreviated to a double colon (::).
2001:0000:0000:0000:0000:3233:da33:3ad3 becomes 2001::3233:da33:3ad3
The loop back address is ::1
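The abbreviation rule can be tried out with the ipaddr library from Ruby's standard distribution. This is only a small sketch; the helper names compress_ipv6 and expand_ipv6 are made up for the example.

```ruby
require "ipaddr"

# Hypothetical helper names, used only for this sketch.
def compress_ipv6(text)
  IPAddr.new(text).to_s       # shortest notation, using "::"
end

def expand_ipv6(text)
  IPAddr.new(text).to_string  # all 8 groups, 4 hex digits each
end

puts compress_ipv6("2001:0000:0000:0000:0000:3233:da33:3ad3")
puts expand_ipv6("::1")
```

The first call prints the abbreviated address from the example above, the second prints the loopback address written out in full.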
Global Unicast Addresses are built like this
001 - 3 bits
Top Level Aggregator (e.g. RIPE) - 13 bits
Reserved (these bits could be added to the TLA or NLA field in the future) - 8 bits
Next Level Aggregator (ISP) - 24 bits
Site Level Aggregator (Subnet) - 16 bits
Interface Address - 64 bits
RFC 3587 obsoleted this format recently. In the future, the toplevel registrars will decide where the borders are.
Because of the hierarchical nature of addressing the routing tables will become much shorter for IPv6 routing.
Currently only 3 TLAs are defined. Today most new addresses are from the 2001:: Sub-TLA Assignment range. There you get 13 bit sub TLA and 19 bit NLA.
The 48-bit Ethernet address can be mapped to the 64-bit IPv6 interface address (first 24 bits, FFFE, last 24 bits, with the universal/local bit of the first byte inverted). This is not required, though: you can use a random number. Just make sure you get no duplicates.
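The mapping from an Ethernet address to an interface address can be sketched in a few lines of Ruby. The sketch follows the modified EUI-64 rule, which also inverts the universal/local bit of the first byte; the helper name mac_to_interface_id and the sample MAC address are invented for the example.

```ruby
# Builds the 64-bit interface identifier from a 48-bit Ethernet address
# (modified EUI-64). mac_to_interface_id is a hypothetical helper name.
def mac_to_interface_id(mac)
  octets = mac.split(/[:-]/).map { |part| part.to_i(16) }
  raise ArgumentError, "expected six octets" unless octets.size == 6

  octets[0] ^= 0x02                                    # invert the universal/local bit
  eui64 = octets[0, 3] + [0xff, 0xfe] + octets[3, 3]   # first 24 bits, FFFE, last 24 bits
  eui64.each_slice(2).map { |hi, lo| format("%02x%02x", hi, lo) }.join(":")
end

puts mac_to_interface_id("00:11:22:33:44:55")  # prints 0211:22ff:fe33:4455
```

Prefixed with fe80:: this gives the link local address many hosts configure for themselves.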
Addresses that start with zeros and end with an IPv4 address can be used for automatic tunneling.
Current versions of the host dns lookup tool will find IPv6 addresses and it will do reverse lookups automatically when given a numeric address.
Multicast addresses
They start with "FF". There are some well-known addresses like
ff02::1 - all nodes on the link
ff02::2 - all routers on the link
ff05::1:3 - all DHCP servers at this site
There is a special entry in the routing table for multicast FF00::/8.
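Whether an address falls into the multicast range is a plain prefix match against FF00::/8. A small Ruby sketch using the standard ipaddr library; the helper name multicast? is made up for the example.

```ruby
require "ipaddr"

MULTICAST_RANGE = IPAddr.new("ff00::/8")

# Hypothetical helper: true if the address lies inside ff00::/8.
def multicast?(text)
  MULTICAST_RANGE.include?(IPAddr.new(text))
end

%w[ff02::1 ff02::2 ff05::1:3 2001::1 ::1].each do |text|
  puts "#{text}: #{multicast?(text)}"
end
```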
ping -c 2 -I eth0 ff02::1 will find all hosts on the local link (on older systems the command is ping6).
IPv6 comes with a new version of the ICMP protocol (known from ping), ICMPv6. It now also takes over the job of ARP and handles multicast group membership management.
Get the IPv6 routing table on Linux
route -A inet6
or use the shortcut notation
route -6
Getting an IPv6 address in IPv4 land through tunneling
One way is to use 6to4. (tldp.org ...) Note that the gateway 192.88.99.1 is a global anycast address which will automatically go to the closest IPv6 gateway. The 6to4 approach requires you to have a public IPv4 address on your machine, or a NAT gateway which can handle protocol 41 (the IP protocol number used for 6to4 tunneling).
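With 6to4 the IPv6 prefix follows directly from the public IPv4 address: it is 2002: followed by the 32 bits of the IPv4 address, which gives a /48 (RFC 3056). A minimal Ruby sketch of that calculation; the helper name six_to_four_prefix and the documentation address 192.0.2.1 are assumptions of this example, not something shown in the tutorial.

```ruby
require "ipaddr"

# Derives the 6to4 prefix (2002:V4ADDR::/48) for a public IPv4 address.
# six_to_four_prefix is a hypothetical helper name.
def six_to_four_prefix(ipv4_text)
  v4 = IPAddr.new(ipv4_text)
  raise ArgumentError, "need an IPv4 address" unless v4.ipv4?

  high = v4.to_i >> 16      # first two bytes of the IPv4 address
  low  = v4.to_i & 0xffff   # last two bytes
  format("2002:%x:%x::/48", high, low)
end

puts six_to_four_prefix("192.0.2.1")  # prints 2002:c000:201::/48
```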
The new Teredo protocol allows even boxes behind a NAT gateway to get connected to IPv6. Windows XP SP2 has this feature built in. On Linux there is an implementation called miredo which can do the same.
Tuesday, September 28, 2004 09:05 // SANE 2004, RAI, Amsterdam
A tutorial by Gerald Carter.
General Things
The big accomplishment of the Samba team is that they document stuff which MS does not document.
In October 2004, support for Samba 2.x will be dropped.
The configuration parameters parsing and the autoconf files in samba 3 are larger than the whole samba distribution of 13 years ago.
Samba 4 is a complete rewrite from the ground up. Don't wait for it!
Samba 3.2 will get backports of some 4.0 features, such as the RPC code and better ACL support. The aim is to make Samba servers look even more like Windows servers.
There are about 3 people working on Samba 3.x and 3 people working on Samba 4.
A further goal for Samba 4 is to make the CIFS protocol a viable alternative to NFS. Unix extensions are being worked on to work out the wrinkles with non-unified UIDs.
Samba 3 tidbits
The big underestimated tool in samba 3 is net. It is similar to its windows namesake. Unfortunately there is not much documentation on it. But if you start it without parameters it will tell you what it does.
If you run samba without netbios support and you want to use several different configurations on the same server you can add virtual interfaces and then use the %I option for loading different configurations depending on the interface the client connected to.
Samba always tells its version number. This is not a security issue, because if knowing the samba version allows someone to hack into a samba server, this means that there is a bug in samba which needs to be fixed.
Per-service parameters set in the [global] section become the default for all services which do not set the parameter explicitly.
To reduce the load on your samba server, use the deadtime option in the [global] section. It is set to 0 by default. If you set it to 15 samba will kill seemingly dead connections (happens a lot with print clients) after 15 minutes without negative effects on the client side in general.
In the samba config file you can access environment variables using the %$(ENVVAR) syntax.
SWAT the samba administration GUI will probably be integrated into samba by letting smbd execute swat for connections on port 901.
Windows will not show any shares ending in $. This is only cosmetic though, it does not prevent connections to the share. Using the 'browseable' setting may make more sense as this will prevent listing of the share from the server side (still no security, but you are free to choose the name).
Configuring samba for guest access
[global]
map to guest = bad user
guest ok = yes
username map = /file
...
And make /file contain
# map everyone to an invalid user
foo = *
Samba Authentication
Windows uses a challenge-response system when authenticating users. This requires both ends to share a common secret. Windows does not store plain text passwords. It does encrypt them, but there is no salt in lanman hashes (Windows encrypted passwords). Even worse, due to the challenge-response system, anyone who is able to get a copy of an encrypted password can use it with a properly hacked smbclient to access the corresponding Windows account. Lanman v2 hashes added some measures to prevent 'man in the middle' attacks, but the base problem remains. This means you have to be much more careful to prevent 3rd parties from accessing encrypted Windows passwords, as they do not even have to be cracked before they can be used.
There is also a positive side to this: due to the challenge-response approach, a hostile (hacked) server will not be able to collect passwords from users trying to log on.
Samba can use multiple passdb backends. If several passdb backends are defined in smb.conf, Samba will search all backends. If a password gets changed, Samba will change it in the passdb backend where the password came from. If a new user is added, it gets added to the first backend listed in passdb backend.
For storing additional information per user, use at least the tdbsam backend. The text based smbpasswd can only store the most basic information.
Quote: LDAP is not that difficult, but the problem is that people try to walk before they crawl.
Samba needs a Unix account for every user.
Note that smbpasswd does not allow entering the password on the commandline anymore, but it can take input from stdin now:
(echo pass;echo pass)|smbpasswd -a user -s
Access
If users have problems with the fact that they can connect to other users home directories, put the following in your [homes] share.
[homes]
valid users = %S
Instead of using complex mask settings for files and directories, you can set the inherit permissions parameter and manage the permissions on the Unix directory level. This allows you to have only one group share with different access permissions down the tree.
Share-level ACLs are done internally in samba, so they do not require any filesystem acl support.
MS-DFS
With MS-DFS, a server can send a transparent referral to a client so that it queries a different server. To make it work the client password must work on both servers.
In smb.conf:
[global]
host msdfs = yes
[dfs]
msdfs root = yes
path = /export/dfs
In /export/dfs do:
ln -s 'msdfs:server1\share1,server2\share2,...' directory
This will cause requests for \\server\dfs to be transparently redirected to \\server1\share1, or to \\server2\share2 if the first one is missing.
Samba can even do DFS proxies. In smb.conf on server1 do:
[proxyshare]
msdfs proxy = \\server2\anothershare
Printing
On a printer share you can define how much space must be left (in kb) before a new job is accepted:
min print space = 5000
In RPC based printing the %c value contains the number of pages to print.
If you use Samba as a print server, you may want to define the default configuration data that is installed together with a printer driver. To do this, install one printer (let's call it seedprinter) with the driver you are interested in and change the printer defaults from Windows. Then call rpcclient with the magic setprinterdata value _p_f_a_n_t_0_m_. This copies the printer configuration data of 'seedprinter' as the default for all printers that use the same printer driver, as well as for any new printer associated with this driver.
rpcclient -U printadmin -c "setprinterdata seedprinter _p_f_a_n_t_0_m_ xxx" server
The xxx argument is ignored, so use just any string ...
The caveat is that when we tried it during the talk, it did not work.
NetBIOS
Samba 3 works fine with netbios disabled. Just don't start nmbd, make sure all your servers are in DNS and use the following in your smb.conf file:
[global]
...
name resolve order = host
disable netbios = yes
Several Samba Servers using the same authentication source
To have several Samba servers authenticate against the same user database, you can set up one Samba server as a PDC and make the other Samba instances clients of that Samba server. Make sure you do not provide winbind with a user id or group id mapping range config, so that it falls back to using the user and group ids provided by the Unix host.
Windows Integration
When storing user profiles on Samba you may want to use the path %H/.winprofile/%a as logon path. This will store the user's profile on a 'per Windows release' basis. Note that the logon path is not your home!
A PDC requires a machine trust account for each host that uses it. These accounts get created when a machine joins the domain. This means that Samba must have appropriate scripts defined for creating them. To be able to run these scripts, machines must join using the 'root' account of the Samba server. This means you need a Samba password for root, and the whole setup may make you feel rather edgy :-). The Samba folks are working on this.
If you ever want to migrate a Windows NT4 PDC to a Samba domain controller the command net rpc vampire is your friend as it will suck all the account information out of an existing PDC. This relies on the availability of the scripts mentioned in the previous paragraph.
Wednesday, September 29, 2004 09:25 // SANE 2004, RAI, Amsterdam, Netherlands
A tutorial by Tom Limoncelli
This is the second time I attended this tutorial. This time I got the first half. Check the (people.ee.ethz.ch ...) entry.
New Hire Process
Draw up a check list for new hires.
Let new users choose their username.
Visit the new hire on their first day for a short chat about the system.
Give them a 2 page handout with the most important things new users need to know.
Show them how to print.
Show them how to access/install software.
Show them how to get help.
Do a follow-up visit in the new user's second week.
Things people expect to be fast
There are some support requests people expect to be handled quickly. If it takes us a long time to do these things, our image will suffer badly. Identify the tasks that are supposed to be quick and make sure that they are.
Users, for example, expect resetting passwords or getting new IP addresses to be quick. Take this into account when deciding what to work on first. Resetting a password really does take very little time, so if you do it immediately, customers will be happy and you do not have to work any harder.
If something is put on hold, tell the user when he can expect the problem to be solved.
You should also look at the damage created by a problem persisting when deciding on the priority for dealing with it. Involve the customer in this decision as he may know more about the side effects of the problem.
Make sure you understand what problem the user has before starting to work. Some people do not report their problems but rather give instructions on what support has to do for them. This can lead to interesting situations where there would be a perfectly simple solution to a problem, but because the user never told you the problem and you did not ask, you start working on the complex solution to an unknown problem.
The visibility paradox
The best System Managers do not get recognized because everything works and people almost forget about them. So we have to become active to make sure people know that we are working for them.
Have a monthly meeting with the leader of each group you are working for. These meetings can be very short ("30 minutes are enough"). Let them talk and mention your things in passing.
Be physically visible. Have stickers on the computers about how to contact support.
Make sure the office layout lets customers first see the people they are supposed to talk to (front line support) when they walk into your space.
Have a yearly town hall meeting with all your users. Have a lecture on a current topic and then have questions and answers. Don't be afraid of unhappy users who might complain in public. This is much better than people complaining about you behind your back.
When spam^H^H^H^Hmailing all your users ... Make sure that grammar and spelling are correct, keep it extra short, and put the important information first. Create a useful subject. People will NOT even start to read the mail when they do not see a reason to do so in the subject.
Customer satisfaction
Users will resist answering complex questionnaires. But if you send a short evaluation mail whenever a request ticket gets closed, you may get better response: One question, three possible answers Happy, Indifferent, Unhappy, with links to paste into the browser.
Helpdesk Scope
How about this? We know which things we are responsible for, and for all the other things people bring to us, we know where to refer them to.
Infinite scope but clearly defined responsibility.
If you walk up to a computer with unsupported hardware, tell the user that this hardware is not supported, but that you are allowed to work on the problem for 30 minutes. Then try to fix it for 60(!) minutes and, if you are not successful, tell the user that they will have to buy a supported card.
Continued in (people.ee.ethz.ch ...)
Thursday, September 30, 2004 09:40 // SANE 2004, RAI, Amsterdam
A Keynote by Paul Kilmartin
eBay is the world's 68th largest economy, right behind Kenya.
WalMart took 41 years to grow to somewhat over 100 billion dollars. MS took 19 years to grow over 30 billion, eBay took 8 years to reach 3 billion. Managing growth is a challenge.
eBay started on the first Monday in September 1995 (Labor Day) out of the eBay founder's bedroom on a $30 account with a local ISP. In January 1997 eBay had 4 PCs at a co-location facility with 10 Mb/s. At that time it was supporting 200'000 auctions. Growth continued: in 1999 they had over 200 servers and over 200 Mb/s of dedicated bandwidth.
In 1999 things started getting really serious, as eBay had a 22-hour outage due to its haphazard IT architecture not keeping up with the rapid growth. The first attempt to get to grips with this was to throw more hardware and clustered computers at it. This did get them some headroom, but things only started to really get better when they started to split the application out into more individual clusters.
The base idea was to restructure the whole architecture so that one server failing should not take down the whole site. This was implemented by having the same code base on all servers, but running only part of it on any particular machine.
Special precautions had to be put in place to prevent clever users from manually changing URLs to execute parts of the eBay functionality on the 'wrong' machine.
In January 2001 another outage happened at eBay, this time lasting 11 hours, as the whole dual-attached SCSI setup for storage went south. At this point a new IT strategy was established at eBay: It is not possible to find 'the ultimate' setup. Everything we set up has a maximum life of 3 years. We not only need good solutions to our current problems, we also have to plan for their replacement with the next better thing.
Today: 100 Database Instances on Sun 440's and 6800's all in VCS auto failover. 7800 Mb/s bandwidth. Search and Listings on 1100 CPUs on 127 Servers, Mail on 271 Servers, ...
Since the 2001 outage the average availability has been around 99.9% ...
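To put the availability figure next to the outage lengths mentioned above, here is the arithmetic as a tiny Ruby sketch. It assumes a 365-day year and takes the 99.9% at face value; the helper name is made up.

```ruby
HOURS_PER_YEAR = 365 * 24  # assumption: a plain 365-day year

# Hypothetical helper: hours of downtime per year for a given availability in percent.
def downtime_hours_per_year(availability_percent)
  (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR
end

puts downtime_hours_per_year(99.9).round(2)  # roughly 8.76 hours per year
```

So 99.9% leaves room for a bit less than nine hours of downtime per year, less than either of the two big outages described above.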
Quotes:
Most support organizations are more interested in closing tickets than fixing problems.
Using market leaders means other people find problems before we do and fixes are introduced quickly.
We are a live entity with millions of active customers, not a cadaver to be dissected. Vendors have to bring fixes for problems not experimental solutions. (Running Sun Explorer is forbidden on production machines.)
We can't set up cameras on the grassy knoll to record the next Kennedy assassination.
Thursday, September 30, 2004 11:08 // SANE 2004, RAI, Amsterdam
A presentation by Luca "ntop" Deri
With today's fast networks, packet capture has become a problem.
The problem is twofold: first the packets have to be captured, and second there must be spare CPU cycles for analyzing the packets.
The tool of the trade is the pcap library which provides a unified interface to packet capture. It presents the same API on every OS while it is highly customized on the machine side.
The problem is that pcap does not perform very well. Especially in problem cases such as denial of service attacks, pcap is not able to capture the traffic because it does not deliver the necessary performance.
There are specialized cards available for packet capture, but they do not have public APIs and are very expensive.
Luca's tests have shown that normal Linux packet capture is really the worst among Linux (0.2%), FreeBSD (34%) and Windows (68%), measured as the share of packets captured.
There have been various attempts to improve this, but even the best solutions were only able to capture (11.7%) of the traffic in the worst case.
In essence this means using Linux for packet capture is a very bad idea. FreeBSD is better by almost an order of magnitude.
Luca's idea was to create a special packet capture architecture for Linux by providing a new socket type, Socket Packet Ring (PF_RING). PF_RING provides a ring buffer in memory for each socket. The application can then read from the ring buffer with mmap. The socket has facilities to record the fact that it had to overwrite a packet in the ring before it was read. This decouples the application from the kernel and improves performance substantially. The implementation is network driver neutral and quite fast (47%), but still a bit slower than FreeBSD, and still over 50% of the traffic is lost in the worst case.
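The ring buffer idea can be illustrated with a small Ruby sketch: the producer never waits for the consumer, it simply overwrites the oldest unread packet and keeps count of how often that happened. This is only a toy model of the principle, not Luca's kernel code; the class name PacketRing and all method names are invented for the example.

```ruby
# Toy model of a fixed-size packet ring with overwrite accounting.
# PacketRing is a hypothetical name, not part of PF_RING.
class PacketRing
  attr_reader :dropped

  def initialize(capacity)
    @slots = Array.new(capacity)
    @capacity = capacity
    @head = 0      # next slot to write
    @tail = 0      # next slot to read
    @count = 0     # number of unread packets
    @dropped = 0   # packets overwritten before they were read
  end

  # Producer side (the kernel in the talk): never blocks.
  def push(packet)
    if @count == @capacity
      @tail = (@tail + 1) % @capacity  # give up the oldest unread packet
      @count -= 1
      @dropped += 1
    end
    @slots[@head] = packet
    @head = (@head + 1) % @capacity
    @count += 1
  end

  # Consumer side (the capture application): returns nil when the ring is empty.
  def pop
    return nil if @count.zero?

    packet = @slots[@tail]
    @tail = (@tail + 1) % @capacity
    @count -= 1
    packet
  end
end

ring = PacketRing.new(3)
(1..5).each { |n| ring.push("packet #{n}") }
puts ring.pop       # oldest packet still in the ring
puts ring.dropped   # number of packets lost to overwriting
```

In the real thing the ring lives in memory shared between kernel and application via mmap, so reading a packet needs no system call per packet.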
The interesting thing at that point was that the CPU was running at 30% in that situation while losing 50% of the traffic. Luca found that disabling and enabling the interrupts in the kernel was preventing it from going into packet capturing mode fast enough. The rtirq patch was able to solve this last problem. Luca ended up with a system that was as fast as FreeBSD in capturing packets but with much more CPU to spare.
This solution is about twice as fast as a commercial netflow capturing probe that sells for much more than the cost of the hardware needed to run Luca's solution.
Luca has been investigating ways to further improve performance and found that Gigabit Ethernet drivers on Linux could be programmed much more efficiently by exploiting the cards local packet buffers. A second issue is that Linux does memory allocation and de-allocation whenever it reads from the network card which takes a long time.
Luca has published his work in a project called nCap, which provides an accelerated variant of libpcap that lets you recompile your old libpcap applications to reach much better speed.
In an attempt to further improve performance, Luca has created a custom gigabit Ethernet card driver that programs the Ethernet card to make its traffic data available directly in the computer's memory, freeing the CPU totally from this task and letting it work on traffic analysis exclusively. Luca calls this new method 'straight capture'. This method gives you traffic capture at device speed, with the limitation that only one application per card can use it.
Thursday, September 30, 2004 12:03 // SANE 2004, RAI, Amsterdam, Netherlands
A talk by Wietse Venema
Wietse told how he got into security in the early nineties at Eindhoven university, while trying to figure out who was scanning their network, writing tcpwrapper in the process and learning lots about network programming.
In 1995 Wietse and Dan Farmer announced their automatic network scanner called SATAN in a white paper. For some people this announcement heralded the death of the Internet. A lot of discussion happened about a sensible way to release such dangerous software. Finally it was released on a limited basis first and then to everybody at the same time. This was even featured on CNN. The whole excitement about the tool was totally overblown though: neither an increase nor a decrease of break-in activity was noted in the wake of the release.
This was just another episode in the ongoing debate over full disclosure or, as a member of the US military put it: "If my systems have a problem, I would rather hear it from a friend."
Finally Wietse talked about his experience when writing PostFix. Work on PostFix started in 1996 out of frustration about the continued security problems of sendmail.
Quote: Some problems in software are found because and others are found because so many users use it.
Today PostFix is the standard on many systems and forms the cornerstone of many large organizations' infrastructure.
The release of the PostFix mailer was touted by the IBM PR department in an article in the New York Times. This caused Lou Gerstner (the CEO at that time) to start asking questions about IBM's strategy in OpenSource. A year later IBM had fully embraced Linux from its smallest to its largest systems.
Quotes:
You can run from Windows but you can't escape from it. Suddenly your UNIX-based mail server becomes a major vehicle for email worms and other malware.
SPF is evil
Spammers don't destroy the infrastructure, it's the well-meaning people with poorly designed countermeasures.
On security of OpenSource vs. ClosedSource: You don't need source to find bugs.
The number of people who contribute source to PostFix that I don't have to spend much time fixing before integration is about two.
Security initiatives are great, but only for new systems.
You can add layers of security around old systems, but these layers will also have bugs.
More on security (seclabs.cs.ucdavis.edu ...)
Thursday, September 30, 2004 16:04 // SANE 2004, RAI, Amsterdam, Netherlands
A presentation by Rüdiger Weis
My interpretation: TCG is basically a system for vendors who do not trust users. It makes it possible to enforce, with hardware support, the licenses that come with TCG-enabled products, even licenses that would not be valid in Europe. TCG 1.2 fixes some of the problems, but only some.
In principle a security infrastructure in every computer would be nice if it did not come with all the (evil) 3rd party interests and did not have all the problems still to be found in TCG 1.2.
The systems that prevent access to copyrighted songs can also be used to prevent access to any other documents. It would even be possible to prevent access to some documents at a later date by changing the appropriate access keys.
Imagine the copyright holder loses his keys: all copies of his work may become inaccessible.
The cryptographic foundations of TCG 1.2 stand on pretty weak (new) legs, as many new methods are being employed in this system.
While TCG supports long keys of 2048 bits, it is still possible to use 512-bit keys and weak 160-bit SHA-1 hashes. This raises questions regarding the cryptographic viability of this system.
Neat: https://www.trustedcomputing.org uses an invalid SSL certificate.
Bundesamt für Sicherheit in der Informationstechnik (www.bsi.bund.de ...)
Thursday, September 30, 2004 16:46 // SANE 2004, RAI, Amsterdam, Netherlands
A presentation by Clifford Wolf
SubMaster is a system for distributed software development where every developer can develop on his own repository and then forward patches to a central repository. The patches received at the central repository go into a system from where their integration into the main repository can be controlled by the project maintainer using specialized tools.
With SubMaster you get a rich tool set for working in such an environment.
SubMaster uses SubVersion as its data repository but wraps it so that distributed development becomes possible.
SubMaster comes with two main scripts sm for maintaining local repositories and uploading patches and smap for applying the patches to the main repository.
Further information is available on (www.rocklinux.org ...)
Friday, October 01, 2004 09:34 // SANE 2004, RAI, Amsterdam, Netherlands
A presentation by Kris Buytaert
OpenMOSIX is a platform for HPC clustering.
An OpenMOSIX cluster is self organizing. It has no central node. New nodes are discovered as they join the cluster.
Each process is split in two parts: the frontend part, which does all the I/O, and a backend part, which does 'the math'. The frontend part stays on the node where it was started. The backend will migrate to another node where it will be executed. Backend parts can continue to migrate at a later stage if the situation changes.
OpenMOSIX uses an economics model to figure out where a process should migrate to.
OpenMOSIX does not require any special libraries to run applications. All the core functionality is in the kernel. There is a set of special tools for administration of the cluster. A lot of configuration information is in the /proc interface.
Applications do not have to be recompiled to run on an OpenMOSIX cluster.
For performance reasons, there is no security between the nodes of an OpenMOSIX cluster. So the setup has to be done on a private network.
OpenMOSIX does not do batch queuing. It can be combined with Condor for this, to get the best of both worlds (excellent scheduling from Condor and simple process migration and check-pointing from OpenMOSIX).
New Features: Migration of shared memory, port to 2.6, no more OpenMOSIX file system (there are enough cluster file systems already, like Lustre or GFS), migration of most features into user-space.
HOWTO: (www.faqs.org ...)
Homepage: (openmosix.sourceforge.net ...)
OpenMOSIX Knoppix (bofh.be ...)
Friday, October 01, 2004 10:23 // SANE 2004, RAI, Amsterdam, Netherlands
A presentation by Marco Pfatschbacher
Normally a load-balancer is a physical box sitting in front of the nodes doing the actual work. In this setup the load-balancer becomes a single point of failure. To have high reliability, we need a second load-balancer with fail-over.
Marco presented a method to set up a group of hosts with load sharing/balancing functionality. Instead of using a dedicated load-balancer, the worker nodes sit on the same Ethernet segment, and each node receives all traffic and just consumes the traffic it is supposed to handle.
In order to receive all traffic, all nodes set up a virtual interface with the same Ethernet address and IP number. Simple repeaters have no problems with this, but switches are normally not happy when they see the same MAC address on several ports. The trick to solve this problem is to configure the physical interfaces to respond with proxy ARP responses telling the switch the Ethernet address of the virtual interface. This will make the switch always flood the network with traffic destined for the IP address of the virtual interface.
The nodes now use a distributed filtering approach (nms.lcs.mit.edu ...) to decide for each incoming TCP connection which node is going to handle it.
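The principle of such a distributed filter can be sketched as follows: every node sees every packet, computes the same hash over the connection's source address and port, and only the node whose index matches keeps the connection. The hash function (CRC32), the modulo rule and the helper names are assumptions chosen for this illustration; the notes do not say which function the presented system uses.

```ruby
require "zlib"

# Every node runs the same calculation, so all nodes agree on the owner
# without talking to each other. Helper names and CRC32 are assumptions.
def owner_index(src_ip, src_port, node_count)
  Zlib.crc32("#{src_ip}:#{src_port}") % node_count
end

# A node keeps a connection only if it is the owner; all others drop it.
def accept?(src_ip, src_port, my_index, node_count)
  owner_index(src_ip, src_port, node_count) == my_index
end

NODE_COUNT = 3  # hypothetical cluster size
[["192.0.2.10", 40_001], ["192.0.2.11", 40_002], ["192.0.2.12", 52_113]].each do |ip, port|
  puts "#{ip}:#{port} is handled by node #{owner_index(ip, port, NODE_COUNT)}"
end
```

A fixed rule like this also shows why the load sharing is static: the mapping does not react to how busy a node currently is.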
High-Availability is implemented through a small daemon ifstated and CARP (www.newsforge.com ...) to redistribute incoming connections appropriately if one node becomes unavailable.
Known Limitations and further work: Load-Sharing is static and stateful packet filtering (PF) can not be used.
Friday, October 01, 2004 14:33 // SANE 2004, RAI, Amsterdam, Netherlands
A talk by Sjoera Nas
Bits of Freedom is a Dutch NGO funded by private parties. Their topics are privacy, freedom of speech, spam, e-voting and copyright. In September 2004 they tested how easy it is to get Dutch ISPs to take down a web page which contains an obviously public domain text.
7 out of 10 providers acted swiftly by taking down the alleged violating document.
The full paper: (www.bof.nl ...)
Monday, August 01, 2005 08:21 // Portland, USA

taught by Dave Thomas
Programmers are like artists: they can only be successful if they have fun doing it. The programmer sitting in front of an empty editor buffer is like an artist in front of a blank canvas.
A good programmer picks the language appropriate for the problem.
Ruby as a language was created in 1994 by Yukihiro Matsumoto (Matz). He combined concepts from various other languages into a new language. Despite what one might expect, the new language is actually beautiful and coherent.
Ruby is similar to Perl in the sense that it does not force a programming paradigm on the user. It is rather a multi-paradigm language allowing procedural, object oriented, functional as well as meta programming.
About the language
All Ruby objects AND classes inherit from the default Class Object which has a default new method on the class that calls the default initialize method on the object. Each method can be overridden.
class Song
  def initialize(a_title)
    @title = a_title
  end
  attr_reader :title
  attr_accessor :artist
end
a_song = Song.new("Hello")
a_song.title
a_song.artist = "Sam"
The attr_* functions are meta programming elements, they create an attribute reader and an accessor method respectively.
def title
  @title
end
def artist=(val)
  @artist = val
end
In Ruby you always have to use accessor methods to get to object variables. The advantage is that your code will always stay the same, regardless of whether you actually do something when a variable is set or just set it directly. The = in the method definition is part of the method name. So even though it looks like an assignment, it is actually a method call.
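A small sketch of why this matters: a writer method can start doing real work later without any change to the calling code. The class name Track and the clean-up rule (stripping whitespace) are invented for this example.

```ruby
# Track is a hypothetical class for this sketch.
class Track
  attr_reader :artist

  # A hand-written writer in place of attr_writer. Callers still just
  # write track.artist = "...", but the value is cleaned up on the way in.
  def artist=(val)
    @artist = val.strip
  end
end

track = Track.new
track.artist = "  Sam  "  # looks like an assignment, is a call to artist=
puts track.artist         # prints Sam
```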
Strings can contain embedded ruby code
"string#{arbitrary ruby code}string"
Methods can be called with blocks of code
def example
  yield "arg1", "arg2"
end
example { |var1, var2| puts var1; puts var2 }
or
example do |var1, var2|
  puts var1
  puts var2
end
The yield keyword executes the block passed to the example call. In the block, the two values passed to yield are accessible as var1 and var2.
Ruby does exception handling
def my_file_open(name)
  f = File.open(name)
  yield f
ensure
  f.close if f
end
my_file_open("file") do |file|
  line = file.gets
  puts line
end
This makes sure f.close gets called even when the block runs into problems and throws an exception.
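The cleanup guarantee can be checked without touching the filesystem. The following sketch uses a hypothetical FakeResource class as a stand-in for a file handle (an assumption for illustration only) and raises inside the block on purpose.

```ruby
# Hypothetical stand-in for a file handle so the sketch needs no real file.
class FakeResource
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Same shape as my_file_open: yield the resource, always close it.
def with_resource
  res = FakeResource.new
  yield res
ensure
  res.close if res
end

leaked = nil
begin
  with_resource do |res|
    leaked = res
    raise "something went wrong inside the block"
  end
rescue RuntimeError => e
  puts "caught: #{e.message}"
end

puts leaked.closed   # prints true
```

The exception still propagates to the caller; ensure only guarantees that the cleanup code runs on the way out.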
Blocks can be nested ...
DBI.connect("DBI:Pg:my_db") do |db|
  db.transaction do
    db.execute("SELECT ...") do |stmt|
      stmt.each do |row|
        # process row
      end
    end
  end
end
In Ruby, variables are untyped while objects are typed by the things they do (their method names). This means that if an object has the right methods, you can use it as a replacement for another object. In Ruby this is called Duck Typing (if it walks and talks like a duck, it might as well be a duck). This helps for things like unit testing. The type of an object is what it can do.
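A small sketch of duck typing in a testing context follows. The write_report method and the RecordingOutput class are hypothetical names chosen for this illustration. Anything that responds to puts can be passed in, whether or not it is a real IO object.

```ruby
require 'stringio'

# Writes a report to anything that responds to `puts`.
def write_report(out)
  out.puts "report line 1"
  out.puts "report line 2"
end

# Hypothetical test double: not an IO, but it "quacks" like one.
class RecordingOutput
  attr_reader :lines

  def initialize
    @lines = []
  end

  def puts(line)
    @lines << line
  end
end

write_report($stdout)          # a real IO object

buffer = StringIO.new
write_report(buffer)           # an in-memory IO look-alike

recorder = RecordingOutput.new
write_report(recorder)         # an unrelated class with the right method
puts recorder.lines.length     # prints 2
```

No common base class or interface declaration is needed; the only contract is the method name.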
Meta programming
Ruby classes and objects are "open". This means you can add new methods or overwrite existing ones.
class String
  def encrypt
    tr "a-z", "b-za"
  end
end
a = "cat"
puts a.encrypt
This adds a new encrypt method to Ruby's standard String class.
Languages should allow you to do cool things even when that allows stupid people to horribly break everything.
Classes and Modules get executed at 'definition time'.
class Logger
  if ENV['DEBUG']
    def log(msg)
      STDERR.puts "LOG: " + msg
    end
  else
    def log(msg)
    end
  end
end
A module is like a class that cannot be instantiated; its methods are called on the module itself, much like static methods. Class methods in normal classes have to be prefixed with the class name to separate them from object methods.
module Dictionary
  WORDS = {}
  File.read("/usr/share/dict/words").split.each do |word|
    WORDS[word] = true
  end
  def Dictionary.known_words?(word)
    WORDS[word]
  end
end
class Dave
  def Dave.hello
    puts "Hello"
  end
  Dave.hello
end
Method names can end in =, ! and ? in addition to the normal characters and numbers. By convention, = is for 'set' methods, ? is for test methods and ! is for dangerous methods.
Methods always get executed on an object (aka the receiver). If no object is mentioned, the default receiver 'self' is used. Inside a class definition the default receiver is the current class.
class Dave
  def Dave.hello
    puts "Hello"
  end
  hello
end
Dave.hello
Subclasses inherit class and instance methods of their parents.
module ActiveRecord
  class Base
    def Base.set_table_name(new_name)
      @table_name = new_name
    end
  end
end
class Book < ActiveRecord::Base
  set_table_name "volumes"
end
Note that @table_name is an instance variable of a class object, namely the class that receives the set_table_name call (here Book). This works because every class, Base included, is itself an object of the class Class.
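The mechanism can be tried without Rails. The sketch below uses a hypothetical MiniRecord module (not the real ActiveRecord) and adds a table_name reader, which is an assumption made only so the result can be inspected. It also shows on which class object the value ends up.

```ruby
# Hypothetical mini-framework, loosely modelled on the example above.
module MiniRecord
  class Base
    # Class method: `self` is whichever class receives the call.
    def self.set_table_name(new_name)
      @table_name = new_name
    end

    # Reader added for this sketch so the stored value can be inspected.
    def self.table_name
      @table_name
    end
  end
end

class Book < MiniRecord::Base
  set_table_name "volumes"   # default receiver here is the class Book
end

puts Book.table_name.inspect              # prints "volumes"
puts MiniRecord::Base.table_name.inspect  # prints nil
puts Book.class                           # prints Class
```

The class method is inherited, but the instance variable lives on the class object that received the call, so each subclass can carry its own table name.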
Additional Ruby Stuff
A good IDE for Ruby is FreeRIDE
Rails is THE Ruby framework for web applications. (www.rubyonrails.org ...)
Ruby has a database interface called DBI (similar to the one in Perl) and it also has a much more powerful one called ActiveRecord that maps databases to classes, objects and methods.
class Book < ActiveRecord::Base
end
will access the database table "books" and create all sorts of useful methods for accessing the information inside it. For dumb databases like MySQL, which do not let you define internal consistency rules, you can use ActiveRecord to define restrictions on what the database should accept.
Why Ruby
Lightweight
Transparent, Ruby is obvious and easy to read
Portable
OpenSource (MIT, Artistic)
Easy to learn, things work the way you, or rather Matz, expect.
Ruby is fun.
Stable Language. The language has not changed much over the recent versions, most action happens in the extension libraries.
Resources
Websites (www.ruby-lang.org ...) and (www.rubygarden.org ...) and the newsgroup comp.lang.ruby
Ruby Programmer guide ... (www.pragmaticprogrammer.com ...) (1st edition is available for free online)
Monday, August 01, 2005 13:29 // USA, Portland // href
taught by Bill Dudney
About Eclipse
Eclipse is a rich client platform (RCP): it gives you a lot of functionality for building rich clients, that is, applications where much of the functionality is contained in the client. Eclipse itself is such a "client".
Currently there are about 100 companies who donate resources to Eclipse development.
The main focus of Eclipse is still Java, but there is a growing number of plugins for using Eclipse to write code in other languages.
Eclipse consists of 1000s of tools. A perspective is Eclipse's way of showing only the tools required for the task at hand. E.g. the Java perspective only shows the tools relevant for Java development.
The Java editor is aware of the Java language and has lots of Java-specific functions like context help and command completion. It can even help to fix bugs by suggesting fixes to common errors.
Eclipse is fully integrated with JUnit. It can automatically generate JUnit test skeletons for any class you select. The Eclipse Java editor has lots of functions for code generation. At the touch of a button it can generate all that fluffy code that does not do anything and still has to be repeated many times. (Ruby solves this problem with its meta programming functionality in the language itself.)
Eclipse can deal with Ant (the Java make alternative) files, but internally Eclipse has its own build system, so it does not keep Ant files in sync. NetBeans, on the other hand, uses Ant internally as its build system, so if you are into Ant, you may want to look at NetBeans.
Debugging Java with Eclipse
Eclipse can remote debug Java applications. With this it is possible to fully separate the application from the Eclipse environment. This comes in especially handy with big Java apps like Tomcat. Use the options -Xdebug and -Xrunjdwp: to start the application with remote debugging hooks enabled (this is a special function of the Sun JDK, so it will only work when running with the Sun JVM).
The Eclipse debugger can be enhanced with custom Java code (toString methods) to teach it how to represent custom types when printing variables.
J2EE development with Eclipse
If you are doing web application development you should get WTP (Web Tools Platform) which provides full J2EE integration for Eclipse.
Installing WTP: First get GEF (Graphical Editing Framework), EMF (Eclipse Modeling Framework) and the Visual Editor Project. Only then can you install WTP successfully.
Tuesday, August 02, 2005 08:32 // Portland, Oregon, USA // href
taught by Alex Russell
What is AJAX?
An acronym for Asynchronous JavaScript and XML
Ever wondered how Google does their cool new apps like (maps.google.co ...) or how something like (shared.snapgrid.com ...) works? There is no real standard yet but a keyword: AJAX. Wikipedia has an evolving page about it: (en.wikipedia.org ...)
The key to AJAX applications is that everything happens on the same page. No page reloading is required.
The browser turns from a dumb page renderer into a protocol client.
Where to use AJAX
AJAX is for interactive applications. For static documents, the current XHTML pages are perfect: accessible, fast, ... Only use AJAX when you can make the users' lives better and not worse.
One of the big challenges is cross-browser compatibility. While MSDN and Mozilla provide good references for their browsers, they obviously do not talk all that much about compatibility problems. There is a good resource for such information on (www.quirksmode.org ...)
Getting XMLHttpRequest working
In order to make HTTP requests from within JavaScript you need an XMLHttpRequest object. While it is readily available in modern browsers, getting one is not so trivial in IE. The most popular way of doing this today is to use conditional compilation in IE.
/*@cc_on @*/
/*@if (@_jscript_version >= 5)
... code ...
@end @*/
Here is a tutorial (www.webpasties.com ...)
More Code Snippets on (www.fiftyfoureleven.com ...)
innerHTML vs DOM
The fastest and simplest way of altering web pages on the fly is to use the innerHTML property of a node. The problem is that setting innerHTML replaces the node's content completely, and with it all its properties. So a new node will have to have its properties re-attached, even if it has the same ID as the old node.
If you use DOM for manipulating content, your code on the browser side will have to do more, since the data from the server arrives as XML and not as pre-generated HTML as it would with innerHTML. The advantage is that you can do much more fine-grained manipulations of the content.
DOM vs innerHTML benchmark (www.quirksmode.org ...)
Send JavaScript from the server?
The fastest way of communicating with the server is to send data encoded as JavaScript from the server and to use eval() on the client. The advantage of this is that we do not have to parse the data actively on the client side, but can use the browser's JavaScript parser to interpret the data. The client side can then access the data structures in JavaScript directly to generate the relevant HTML/XML code. This is especially efficient if you are dealing with large tabular data structures.
There is a standardized subset of JavaScript called JSON (JavaScript Object Notation) for this lightweight data exchange method. More about this on (www.crockford.com ...)
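On the server side, producing such a payload is usually a one-liner. The sketch below uses Ruby's standard json library; the rows themselves are made-up sample data, and the round trip through JSON.parse merely stands in for the browser decoding the response.

```ruby
require 'json'

# Hypothetical tabular data as it might come out of a database query.
rows = [
  { "id" => 1, "title" => "Hello",   "artist" => "Sam" },
  { "id" => 2, "title" => "Goodbye", "artist" => "Kim" }
]

# Server side: serialise the rows; this string would be the HTTP response body.
payload = JSON.generate(rows)
puts payload

# Round trip, standing in for the client decoding the response.
decoded = JSON.parse(payload)
puts decoded[1]["title"]   # prints Goodbye
```

Because the payload is plain data in a fixed notation, the same string can be consumed by eval() in a browser or by a JSON parser in any other language.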
REST API
Communication between an Ajax-style UI and your Server should happen via the REST (Representational State Transfer) API of your Web application. (en.wikipedia.org ...) and (www.xfront.com ...)
Debugging JavaScript
The simplest way is to open the JavaScript console of your browser to see any errors the JavaScript engine generates. For automated testing you may want to use a JavaScript implementation that can be started from the command line, like Mozilla's Rhino (www.mozilla.org ...) project. There are also several Mozilla extensions that help:
LiveHTTPHeaders (livehttpheaders.mozdev.org ...) shows the HTTP headers exchanged between Mozilla and the server.
Venkman, a graphical JS debugger (www.hacksrus.com ...)
Ethereal is also helpful as you can see what really happens on the wire. If you do not have the necessary permissions to sniff data off the wire, you may want to redirect your browser through a proxy where you can dump the data that traverses it.
Links
Examples (dojotoolkit.org ...)
Simple AJAX Toolkit (www.modernmethod.com ...)
About using Ajax on Rails the Ruby web toolkit (www.onlamp.com ...)
Ruby on Rails (www.rubyonrails.org ...)
An evolving browser UI toolkit with pluggable widgets that separate HTML from JavaScript Code (dojotoolkit.org ...)
CPAN for JavaScript (www.openjsan.org ...)
Wiki with AJAX Framework overview (www.ajaxpatterns.org ...)
Nice widget and screen effects library for ajax applications (openrico.org ...)
Tuesday, August 02, 2005 14:04 // Portland, Oregon, USA // href
taught by Evan Lenz
XSLT is a language for processing XML documents. XSLT itself is written in XML. The output generated by an XSLT 'program' can be anything, but normally it is used for generating (X)HTML documents which can then be displayed by a browser.
XSLT uses the XPath language for addressing 'nodes' in an XML document.
XPath Expressions
An XPath expression is made up of several steps separated by /.
step/step/step
A step consists of three elements: the axis, which identifies a set of nodes relative to the current context node; the node test, which filters the relevant nodes out of the set selected by the axis; and finally any number of optional predicates to further filter which nodes get selected.
axis::node-test[predicate][predicate]
XPath expressions can return 4 types of data:
node-set - a collection of zero or more nodes without duplicates
number - a floating point number
string - a Unicode string
boolean - true or false
XPath knows seven different types of nodes:
Root - the toplevel node of a document is called "/"
Element - 'tags'
PI - Processing instructions <?xml-stylesheet ...?>
Comment - Comment tags
Text - Character data, including white space!
Attribute - <tag attrib="xxx">
Namespace - the namespaces in scope for an element, declared via xmlns attributes
When selecting XPath nodes, you can use 13(!) different axes. By default, you use the child:: axis. Default means that you don't even have to mention it. So an expression like section is actually child::section. The other common axis is attribute::. It also has an abbreviation, @, so instead of writing attribute::section you can write @section.
The other 11 axes are: descendant-or-self:: (// is short for /descendant-or-self::node()/), parent:: (parent::node() is abbreviated as ..) and self:: (self::node() is abbreviated as .). The remaining axes do not have abbreviations: ancestor:: following-sibling:: preceding-sibling:: following:: preceding:: namespace:: descendant:: ancestor-or-self::
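To make the abbreviations concrete, here is a small sketch that rewrites an abbreviated path into its long form. The helper names expand_step and expand_path are hypothetical, and the code is a plain string rewrite, not an XPath parser: it assumes that predicates do not themselves contain / characters.

```ruby
# Expands one abbreviated XPath 1.0 location step into its unabbreviated form.
# Predicates such as [2] are simply carried along with the step.
def expand_step(step)
  case step
  when ""              then ""                   # keeps a leading "/" intact
  when ".."            then "parent::node()"
  when "."             then "self::node()"
  when /\A@(.+)\z/     then "attribute::#{$1}"
  when /\A[\w-]+::/    then step                 # axis already spelled out
  else                      "child::#{step}"
  end
end

# Expands a whole path; "//" becomes "/descendant-or-self::node()/".
def expand_path(path)
  path.gsub("//", "/descendant-or-self::node()/")
      .split("/")
      .map { |step| expand_step(step) }
      .join("/")
end

puts expand_path("section")       # prints child::section
puts expand_path("../para[2]")    # prints parent::node()/child::para[2]
puts expand_path("chapter//@id")  # prints child::chapter/descendant-or-self::node()/attribute::id
```

Reading a short expression in its long form is often the quickest way to see which axis each step really uses.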
XPath example
XPath looks at an XML document as if it were a filesystem. As when navigating a filesystem, there is a context node from where XPath looks at the document.
<article>
  <heading>Hello</heading>
  <para>Paragraph <emph>1</emph></para>
  <para>Paragraph 2</para>
</article>
If the first para node is the current context node the expressions would return the following:
* - emph
emph - emph
.. - article
../* - heading,para,para
../para[2] - the second paragraph
/article/* - heading,para,para
The XSLT Processing Model
The most important instruction in XSLT is xsl:apply-templates. A lot of people do not use it properly since they write only one big template instead of many small ones for different purposes.
If you use multiple templates, XSLT will invoke a conflict resolution protocol when several templates match a particular element.
Selection happens by priority:
-0.5 match="*"
-0.25 match="xyz:*"
0 match="name"
0.5 match="nameA/nameB"
You can override the priority of a template by setting the priority attribute explicitly.
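The selection rule can be sketched in a few lines. This is a simplified model, not how a real XSLT processor is implemented: the Template struct and both helper methods are hypothetical, only the four pattern shapes listed above are recognised, and import precedence and tie-breaking between equal priorities are left out.

```ruby
# Default priority for a simplified match pattern, using the values listed above.
def default_priority(pattern)
  case pattern
  when "*"
    -0.5    # any element
  when /\A[\w.-]+:\*\z/
    -0.25   # any element in one namespace, e.g. xyz:*
  when /\A[\w.-]+(:[\w.-]+)?\z/
    0.0     # a single element name
  else
    0.5     # anything more specific, e.g. nameA/nameB
  end
end

# Hypothetical template record: a match pattern plus an optional explicit priority.
Template = Struct.new(:match, :priority) do
  def effective_priority
    priority || default_priority(match)
  end
end

# Among templates already known to match a node, the highest priority wins.
def pick_template(candidates)
  candidates.max_by(&:effective_priority)
end

candidates = [
  Template.new("*"),
  Template.new("para"),
  Template.new("article/para")
]
puts pick_template(candidates).match   # prints article/para

candidates << Template.new("*", 2)     # explicit priority overrides the default
puts pick_template(candidates).match   # prints *
```

The second call shows why an explicit priority attribute is a blunt tool: a catch-all template with a high priority will shadow every more specific rule.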
Whenever XSLT processing is started, the processor executes the template rule for /. If you do not supply a template rule, the processor will use its built-in template rules (there is one for each of the seven node types).
Wednesday, August 03, 2005 08:34 // Portland, Oregon, USA // href
Tim O'Reilly
While many building blocks of software are now freely available (OS, database, web servers, browsers, ...) there is a growing number of web-based applications that take a more and more important role in everyday life and are not open at all. What about Google Maps? What if Google changes its APIs and you have built a business on Google Maps?
The next wave of big applications will be web integration apps that pull data from existing websites and integrate, filter, consolidate it in new and interesting ways.
Kim Polese of SpikeSource
She talks about the trend from "Do it yourself" to "Do it together" in OpenSource. We are moving from an EGO system to an ECO system, with OpenSource making DIY-IT possible. Today, traditional companies start putting OpenSource tools into their business critical applications. IT organizations start putting packages together from a large number of OSS components. The challenge is that these components are all on different release schedules. The bigger the packages get, the more resources have to be invested into keeping these packages tested and up-to-date as new features are added. These custom software packages/stacks have become a competitive advantage for these companies, so they are going to stay. The problem these companies are facing is that they have to become their own software company, with its own testing, release system and everything a normal software vendor does. This costs a lot of money and thus opens business opportunities for a new kind of service company, one that lets companies offload part of the work of creating and maintaining their own software stack.
SpikeSource is specializing in OpenSource testing. They offer software testing services to companies, but also free testing resources to OpenSource developers.
Andrew Morton, Kernel Integrator with OSDL
How will Linux succeed on the desktop? It will happen in the same way it worked with other OSS things: they creep from the bottom up along the food chain. Do not expect Linux to take the Windows power users' desktops by storm. These are the most difficult people to cater for, but they are not the majority, even though they can be very vocal. Linux is in use on many special purpose desktops today: point-of-sale, trading floor, data entry. As companies see Linux working well in these specialized applications, the interest in using Linux for more general purpose desktop applications will grow.
Wednesday, August 03, 2005 10:45 // Portland, Oregon // href
by Damian Conway and Larry Wall
We are almost there, yet! Despite the naysayers. We are close to finishing the design of the language that was never going to be finished. We are close to finishing the implementation of the language that was never going to be implemented.
What is new in Perl 6
use strict and use warnings are always on
No strict refs necessary since the language uses a different syntax
No strict subs since Perl6 has no barewords
No raw assignments in conditionals, since there is a new operator for that.
if ($a = max()) {...} becomes if (max() -> $a) {...}
String interpolation is much simpler. No more problems with user@address.ch, and many more sensible interpolations are possible with the new syntax. The simplest approach is to add braces to a string; their contents will be executed as Perl code.
say "The current time is {localtime}";
Every scalar has a method to sprintf it ...
$score.as('%6.2f')
%hash.as('%-10s %2f',"\n");
Single quotes can be modified with 'adverbs' to interpolate some things.
The heredoc syntax has been modified to
print q:to/END/
    blablbalb
    END
Note that the indentation of the END marker specifies the left margin of the heredoc.
Index keys do not autoquote
%hash{larry}
would call the larry function. Now use
%hash<larry>
for autoquoted keys. This creates a problem with filehandles, which used <...> for reading. Use this instead
while $fh.shift() {...}
or rather with the unary iterator operator
while =$fh {...}
this operation is lazy, so you can use
for =$fh {...}
and
while (<>) {...} becomes while =$ARGS {...}
or rather
while =<> {...}
aka the fish operator. Filehandles automatically chomp any input.
In Perl6 every object has a perl() method that produces a perl representation of the data structure.
say $file.perl()
no more Data::Dumper.
There is a new reduction operator
$dot_prod = [+] @vec1 >>*<< @vec2
Perl6 has much better introspection capabilities via question referents
$?SUBNAME $?LABEL @?LABEL
They tell you about almost any aspect of Perl's run-time as well as compile-time environment.
Play with Perl6 today, try pugs.
Wednesday, August 03, 2005 11:34 // Portland, Oregon, USA // href
by Brian Ingerson
Online (www.kwiki.org ...)
Pugs is just another CPAN module but it is a fully working Perl 6 interpreter. Pugs is implemented in Haskell which is a purely functional language.
Currently there are about 100 people with commit access to pugs and it is growing at a startling rate.
If you want to help with Perl6 you do not need to know Haskell, since a lot of code these days is written in Perl6 already.
Wednesday, August 03, 2005 16:30 // Portland, Oregon, US // href
by Mike Shaver
Extensions for Firefox do everything from ad blocking to on-the-fly website alteration with GreaseMonkey.
Extensions are composed of 4 elements: a manifest, chrome (user interface components), components (non-UI) and default settings.
If extensions are popular and non-intrusive to people who do not use them, they may get integrated into the default Firefox.
User interface components are written in XUL, the same XML language Firefox uses for its own UI. XUL can insert new elements into the browser UI, it can add and alter attributes of existing UI elements and it can even remove elements completely.
The best way to learn about writing extensions is to take a simple existing extension apart and modify it. Extensions are stored in .xpi files, which are just simple zip files with their file name extension changed. Unzip the file and off you go.
Extension writing was much simplified in Firefox 1.5, so if you are starting out now, you may want to use the current Firefox 1.5 snapshots (Deer Park) for development.
When you have created a useful extension, upload it to (addons.mozilla.org ...)
Analytical tools for extension writers: DOM Inspector, Venkman the JavaScript debugger.
Links
Books are ok as well, but things evolve, so while the base concepts are all the same you may have to adapt paths and node names to work with current versions of Firefox.
Wednesday, August 03, 2005 17:25 // Portland, Oregon, US // href
by David Heinemeier Hansson
Rails is a Ruby based Webapp framework that came out of the development of Basecamp. Basecamp is a commercial Web based project management software written by (www.37signals.com ...)
David's idea when designing Rails was to pull the good ideas from other languages and platforms into a context adapted to the small resources available to him at 37signals (0.25 programmers and 1 designer).
Instead of designing the framework before writing Basecamp, David decided to create Rails by extracting the framework from Basecamp. So instead of making assumptions about what he would need in the future, he wrote Rails in parallel with Basecamp. He calls this application driven development.
How to be successful with an OpenSource project:
Be visible. If no one cares for your OSS project you might as well not "opensource" it, since the cost of publishing will never be recovered.
Release only once the culture of the project is fairly set, so that new influences through new users will not bend the project out of shape. Make sure you can handle the contributions and bug reports from your users.
Increase your visibility by taking on the 'leader of the pack'. Be careful about aggressiveness though: if you are aggressive towards others, you will also attract aggressive people to your project.
Content © by Tobias Oetiker