
Tobi Waves



Programming on a Sunday

Sunday, February 09, 2003 23:58 // Radisson SAS, Västerås, Sweden

A round-trip flight from Zurich to Stockholm costs 2280 CHF unless you stay over the weekend or stop over in Prague, in which case the price drops to 680 CHF. I opted for the weekend solution and arrived a day early in Västerås. From Monday to Friday the annual Scandinavian System Management Conference, aka NordU/USENIX 2003, will take place here. I am scheduled to give a talk on large-scale system management on Thursday.

But I am skipping ahead; the week has not even started yet. In Sweden, shops are closed on Sunday, and on this particular one it was also rather cold outside. So apart from a short walk to the lakeside I spent most of the day in the hotel room, writing my own web journal software.

I had hunted the net for a bit, but everything out there seems to insist on doing all the editing over the web. What a drag. Why should I edit with a crappy text widget in a browser when my beloved editor sits on the very same machine? So I dug in, and finally, after many hours, a first version of the code is working and I am writing my first entry. I am planning to use this for writing reports on the talks and tutorials I am going to attend this week.

 

Not allowed in

Monday, February 10, 2003 09:26 // Aros Congress Center, Västerås, Sweden

Still early today, and I have learned my first lesson: always negotiate first and then make a deal; don't try to change it later. I was asked to give a talk at the NordU 2003 conference. My talk is scheduled for Thursday. I assumed that, since I am here for the week, I would get free admission to the tutorials if there was space, similar to what I am used to from the SANE conference. Unfortunately it seems that the rules here in Sweden are different. No free admission for speakers ... As I said, I guess I should have asked first, before picking the cheaper flight over the weekend.

Then again, the Internet access is quite decent here, the chairs in the terminal room are comfortable and I got a nice big Apple Mac LCD screen to work with. At the actual conference I can go to all the talks, so I guess I am not missing all that much. If things around here stay as quiet as they are presently I might even get some work done.

 

The Apple Mac Keyboard

Monday, February 10, 2003 15:02 // Aros Congress Center, Västerås, Sweden

Here in the terminal room they have these beautiful new iMacs with the 1440x900 flat panel screen. After I had played around for a bit with all the nice gadgets and been impressed by the sleek performance of their new Safari web browser, I decided to get to work and installed the new rootless X server Apple is now offering for download. All went well and I was soon looking at my first xterm. Work could begin, so I thought, but then it hit me. Where the hell are all the keys on this Swedish Mac keyboard? No {, no ], no *, not even a $. Basically everything I need for programming was missing. Eventually I discovered that while the keys were not labeled, all the necessary characters were reachable using combinations of Alt and Shift. I went on and even found out how to start the Xquartz server with the Swedish keymap applied, only to discover that in this configuration the Alt key was acting as Meta and thus lost all its power to create any of my most wanted characters.

If I ever buy a Mac, I will have to get a 3 button mouse and a proper keyboard with it. I wonder how programmers at Apple work? Maybe there is a special high clearance programmer's keyboard available with all the secret keys labeled properly and working. This model would wreak havoc in the Mac user population if leaked to the unwashed, easily confused masses. It would send Apple's hotlines into meltdown as desperate users try to get help in picking the right keys. What is the world coming to?

 

Tutorials here I come

Tuesday, February 11, 2003 10:31 // Aros Congress Center, Västerås, Sweden

Yesterday I told various people that I found it rather annoying not to be allowed to go to the tutorials. And guess what, just now Kristen Nielsen of the program committee was here and told me that it was all a misunderstanding, and that I was welcome to join any of the tutorials I want. So I guess I will have to start writing conference reports sooner than I expected.

 

NordU 2003 Tutorial: Creating Happy Users

Tuesday, February 11, 2003 13:36 // Aros Congress Center, Västerås, Sweden

Tools for Creating Happy Users by Tom Limoncelli and Christine Hogan

I only joined the class in the afternoon, so I can not report on the material covered in the morning. Below I have taken some notes on the topics I found most interesting.

Handling Support Calls

The first topic was "handling support calls". Tom made the case that handling support calls can be taught and is not just something one knows or doesn't. On the first level, each call can be broken down into: Greeting, What's wrong?, Fix It, Verify It.

After the greeting, the second step is to get a complete problem statement. Where, When, Who, What? No special tech knowledge is required here, just social skills, active listening and probably some checklists. A complete problem statement will make the actual problem solving much simpler.

Getting a problem statement is already quite sensitive. E.g. one should never ask things which encourage the user to lie: "Is it plugged in?" is bad. Better: "Let's check both ends of the power cord". When a problem statement has been created, the problem has to be verified/reproduced/scripted, because otherwise there is no way of verifying the effectiveness of an eventual fix.

The third step is where the tech/fixing skills are really important. In trying to find a solution, it can be helpful to involve the user in selecting which approach is taken to fix the problem, thereby taking their situation (e.g. time pressure) into account. Keep in mind that you want to know what caused the problem to go away, so only do one thing at a time.

The fourth and probably most important step is that both the person who fixed the problem and the user who reported it verify and agree that the problem has been fixed.

Unless you monitor something, you can not call it a service!

Customers rely on the services we provide, so if there is a problem with one of the services, we should know before the customers do. "Yes, we are already working on it, we expect to have this fixed by 10am, we will send out mail when it works again." is much better than "Oh, for how long has the company website not been working?"

Monitoring makes it possible to find patterns in breakage, it assists in capacity planning, and if it is combined with a knowledge base, solutions to known problems can be shipped out with the alarm.

My points: Optimize the monitoring for the case when all is well, because this is the state it will be in most of the time. Don't have the monitoring system take 'counter action', because this leads into the 'nanny trap' where successive layers of software are fixing the problems of the underlying system instead of fixing the root cause of the problem. Check out (isg.ee.ethz.ch ...) and also (people.ee.ethz.ch ...) for two tools in this area.
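The monitoring points above can be sketched in a few lines of Python. This is only an illustration under assumptions: the service names, the stand-in check functions and the knowledge base dictionary are hypothetical and not taken from the tutorial or from the tools linked above. The monitor only reports and ships a known solution with the alarm; it deliberately takes no counter action.

```python
from typing import Callable, Dict, List

# Hypothetical knowledge base: service name -> known fix to ship with the alarm.
KNOWLEDGE_BASE: Dict[str, str] = {
    "web": "Check whether the disk holding the web server logs is full.",
}


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check once and return alarm messages for failed services.

    The monitor only reports. It takes no counter action, so the root
    cause stays visible instead of being papered over (the 'nanny trap').
    """
    alarms = []
    for service, check in checks.items():
        try:
            healthy = check()
        except Exception:
            # A check that crashes is treated as a failed service.
            healthy = False
        if healthy:
            continue  # the common case: all is well, nothing to do
        hint = KNOWLEDGE_BASE.get(service, "No known solution on file.")
        alarms.append(f"ALARM {service}: check failed. Hint: {hint}")
    return alarms


if __name__ == "__main__":
    # Stand-in checks; real ones would probe the actual services.
    demo = {"web": lambda: False, "mail": lambda: True}
    for line in run_checks(demo):
        print(line)
```

The healthy path does nothing but move on to the next check, which keeps the common case cheap.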

Want to learn more?

Incidentally Tom and Christine have written a book: The Practice of System and Network Administration (www.amazon.com ...) which expands on the soft topics of system administration.

 

NordU 2003 Tutorial: Solaris Internals

Wednesday, February 12, 2003 17:00 // Aros Congress Center, Västerås, Sweden

Solaris Internals: Architectural Tips and Tidbits by Richard McDougall richard.mcdougal@eng.sun.com

In this tutorial, Richard highlighted various components of Solaris. Below I have listed the things I found interesting with an emphasis on Solaris 9 features and Solaris 8 tricks.

The Solaris kernel is preemptive. There are only very few non-preemption points in critical code paths. Combined with the threaded kernel this allows for very scalable handling of IO interrupts.

In the 64bit transition of Solaris only long and pointer were changed from 32 to 64bit; the other types stayed the same size (32bit -> ILP32, 64bit -> LP64).
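As a small illustration of the two data models, the standard ILP32 and LP64 type sizes can be written down as a table and compared. The function name below is made up for this sketch.

```python
# Sizes in bits of the basic C types under the two data models.
# These are the standard ILP32 and LP64 definitions.
DATA_MODELS = {
    "ILP32": {"char": 8, "short": 16, "int": 32, "long": 32,
              "long long": 64, "pointer": 32},
    "LP64": {"char": 8, "short": 16, "int": 32, "long": 64,
             "long long": 64, "pointer": 64},
}


def changed_types(old: str, new: str) -> list:
    """Return the types whose size differs between two data models."""
    return [t for t, bits in DATA_MODELS[old].items()
            if DATA_MODELS[new][t] != bits]


if __name__ == "__main__":
    print(changed_types("ILP32", "LP64"))  # ['long', 'pointer']
```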

Oracle profits from 64bit by being able to cache all disk blocks and thus access data without system calls. Memory mapping a whole database is not unreasonable with today's memory prices.

Because 64bit code has bigger pointers and longs, it has to move more data, which results in a 5% performance loss. If other 64bit features are used, this loss of performance is not significant.

Solaris 8 has a new modular, live kernel debugger called mdb which replaces adb and crash. It lets you look at things like the list of open files and other kernel data structures. In Solaris 9 adb and crash are removed. In Solaris 9 mdb even understands C structs defined in header files. It uses this information for "nice" memory dumping.

The kstat command is used to show kernel statistics, e.g. kstat -n system_misc. Kstat is written in Perl and uses a Perl module to access the data. This module can easily be used in other programs.

About the Solaris Virtual Memory System

VM is all done on demand, pages get loaded as they are used, but the memory reservation (in swap) happens on exec, fork and break. The effect of this is, that a lot of swap gets reserved but it does not actually get used as long as it does not get touched by the code. Forking apache reserves the same amount of memory for the forked copy as for the original but it only copies the bits of the original that get modified (copy on write). The advantage is that Solaris will be able to actually provide all the memory a process got allocated in the first place and will not die halfway through the operation because it promised more memory than what was actually available. (Linux and AIX are less conservative as they hand out reservations without looking at the available memory.)
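The difference between the two reservation policies can be shown with a toy model. It is a sketch only: the class names and the megabyte figures are invented for illustration, and neither class reflects how any real kernel does its accounting.

```python
class StrictReservation:
    """Toy model of up-front reservation: refuse what cannot be backed."""

    def __init__(self, available_mb: int):
        self.available_mb = available_mb
        self.reserved_mb = 0

    def reserve(self, mb: int) -> bool:
        if self.reserved_mb + mb > self.available_mb:
            return False  # fail early, at fork/exec time
        self.reserved_mb += mb
        return True


class Overcommit:
    """Toy model of optimistic reservation: every request is granted."""

    def __init__(self, available_mb: int):
        self.available_mb = available_mb
        self.reserved_mb = 0
        self.touched_mb = 0

    def reserve(self, mb: int) -> bool:
        self.reserved_mb += mb  # no check against what is available
        return True

    def touch(self, mb: int) -> bool:
        """Actually use memory; an overcommitted system can fail here."""
        if self.touched_mb + mb > self.available_mb:
            return False
        self.touched_mb += mb
        return True


if __name__ == "__main__":
    strict = StrictReservation(available_mb=100)
    print(strict.reserve(60), strict.reserve(60))  # True False

    loose = Overcommit(available_mb=100)
    print(loose.reserve(60), loose.reserve(60))    # True True
    print(loose.touch(60), loose.touch(60))        # True False
```

With strict reservation the second request fails at the moment it is made; with overcommit the failure only shows up later, when the memory is actually touched.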

The loading of programs uses a similar process where the file on disk becomes part of the OS memory (it gets memory mapped) and then gets successively loaded (demand paged) into RAM as parts of the program get executed.

When multiple copies of a program get started (or forked) they will all share the same memory as long as they do not modify it (copy on write).

How to determine how much memory a process really uses? The text segment of each process is read only so all instances of the same program or shared library are shared. The data segment is initially shared but will get split partly as the programs modify their statically allocated memory and copy on write happens. With

pmap -x  PID

I get a detailed report of what memory the process really uses. The interesting part is the total of private memory (or anon memory in Solaris 9) which shows the amount of memory which is exclusive to the process. Solaris 9 gives us pmap -S which will show how much memory is reserved in swap.

Memory Access Speed

Solaris 9 U1 has some support for NUMA (non-uniform memory access) architectures. This means that Solaris can handle the fact that different sections of the memory have different access times. While Sun's multi CPU boxes are essentially symmetrical, so all memory has the same access time, the larger variants like Ex800 or F15k are slightly NUMA. Solaris 9 U1 deals with this fact by building memory latency groups and taking the access time relative to the CPU into account when allocating memory to a process. This happens automatically, but there are also APIs for influencing this when writing programs.

How to determine if there is a memory shortage

Solaris 8 has an all new paging system which has much better performance than previous versions. The vmstat -p 3 command will show a detailed report on the paging activity. A high number in the Anonymous column indicates that there is a shortage of physical memory, so this is the way to check if any bad swapping is happening. Make sure all VM tunables are removed from /etc/system when migrating to Solaris 8.

In Solaris 9 the ::memstat command of mdb -k shows a detailed list of memory usage.

CPU Usage Accounting

In Solaris 8 interrupts (e.g. from the network card) are accounted as idle time, while Solaris 9 accounts for them correctly.

With trapstat in Solaris 9 it is possible to see how many interrupts are occurring.

How to determine where the kernel is spending its time

The command lockstat -kIi997 sleep 10 will monitor what the kernel is doing for 10 seconds, sampling the kernel threads 997 times a second, and then show you what it found. lockstat sleep 10 will tell how many locks of what type occurred where within the last 10 seconds. The command mpstat 1 will show some per processor statistics. In particular, the column Intr will show if the box is spending a lot of time being interrupted. If Smtx is high, this indicates that the CPU is spinning on mutexes while it unsuccessfully tries to acquire a lock.

Kernel level Threading

Up to Solaris 8 the default thread model was based on the idea of a user-level scheduler mapping user-land threads onto kernel threads (LWP). Thread switching happens on blocked threads only. So this is almost like cooperative multitasking, with the obvious problems. The nice thing is that this can be very fast, as the handing off between threads is very lightweight, which could make for very efficient programs. Unfortunately the whole system is complex and quite unreliable; the massive number of thread patches is witness to this fact.

An alternative threading library is sitting in /usr/lib/lwp. It turns all threads into kernel threads and scales really well because there is now only one scheduler. All the fairness problems go away. What is really cool about it: a simple

LD_LIBRARY_PATH=/usr/lib/lwp program

will make a threaded application use the kernel threads. Note that this only works really well from S8U7 (kernel jumbo from Feb 2002) onwards. In Solaris 9 the new threading lib is the default.

A better top: prstat

Solaris 8 includes something like top called prstat. It has various options to show all sorts of statistics: -m will show per process microstate accounting, where it is possible to see much better where a process is spending its time. Without options it acts just like top. With -t you get stats on a per user level.

Tricks with truss

See what is happening in the program besides system calls

truss -d -u a.out,libc program

Find out which system calls take how much time

truss -c program

Filesystem Tuning

Machines which access many files concurrently might profit from setting ufsninode and ncache to higher values. Searching for "system tuning manual" on (docs.sun.com ...) will give more information. This could also be of interest for machines running diskless which access many different files concurrently over NFS, as inode cache misses are even more expensive there.

In Solaris 9 there is an option to get checksumming on the device driver level to make sure files do not change on disk.

The kernel parameters ufs_LW and ufs_HW make sure that UFS does not consume too much memory (write throttling). This has a rather negative impact on performance, as their default values are too low. In Solaris 9 they are therefore set to higher values. Richard suggested putting the following in /etc/system on Solaris 8 systems, especially if they have a lot of memory:

set ufs_LW=4194304
set ufs_HW=67188864

Check (kr.sun.com ...) and (www.princeton.edu ...) for some information.

Another important tunable parameter is maxphys. It sets the maximum chunk of data the filesystem will write to disk in one go. For SCSI this is set to 128k by default, which is way too low. Richard suggests setting this to 2 MByte.

set maxphys=2097152

Check (206.231.101.22 ...)
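As a quick sanity check on the byte values in the settings above, a short Python sketch can parse the set lines and convert them to MiB. The helper names are made up for illustration, and the parser only handles the plain decimal "set name=value" form shown in these notes.

```python
def parse_system_settings(text: str) -> dict:
    """Parse 'set name=value' lines as used in the notes into a dict of ints."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("set "):
            continue  # skip blanks and anything that is not a 'set' line
        name, _, value = line[len("set "):].partition("=")
        settings[name.strip()] = int(value.strip())
    return settings


def to_mib(n_bytes: int) -> float:
    """Convert bytes to MiB (1 MiB = 1048576 bytes), rounded to 2 decimals."""
    return round(n_bytes / 1048576, 2)


# The three values exactly as written in the notes above.
NOTES = """
set ufs_LW=4194304
set ufs_HW=67188864
set maxphys=2097152
"""

if __name__ == "__main__":
    for name, value in parse_system_settings(NOTES).items():
        print(f"{name}: {value} bytes = {to_mib(value)} MiB")
```

ufs_LW and maxphys come out as exactly 4 MiB and 2 MiB, while the ufs_HW value as written is a little over 64 MiB.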

To learn more ...

For more information, Richard recommends his Solaris Internals book: (www.amazon.com ...) or (www.solarisinternals.com ...)

 

The effects of a full day Tutorial

Wednesday, February 12, 2003 17:09 // Aros Congress Center, Västerås, Sweden

Yesterday Tom Limoncelli said that he thought that one day tutorials were probably the best method for learning new things. Today I did the 'self experiment' and spent the whole day in Richard McDougall's Solaris tutorial. Only minutes after the tutorial has finished I can confirm that I really do feel exhilarated and would like to try out all the things Richard touched upon. But looking more closely at what exactly it is that I learned from today's slew of slides and explanations, all that remains are my notes, a headache and quite a number of areas I would like to investigate. I have not yet learned anything in the sense that I have tested and applied in the real world what Richard has been explaining. So even though I would never have "learned" as much as I heard today, I guess I would have profited more from less information and more hands-on training on real world problems. Or if hands-on was not possible, then probably paper exercises where I had to think up solutions which would then have been discussed later on.

The main problem with this more thorough approach is, that it would be way less sexy than the information blast method and people who pay for these tutorials want something out of them. At worst people might complain, that while they had learned how to make their Sun run faster they effectively had figured it out on their own and wondered why they had to pay so much for an instructor to only ask them questions.

 

NordU 2003 Talk: Open Source at Turku City

Thursday, February 13, 2003 08:29 // Aros Congress Center, Västerås, Sweden

by Eija Onnela eija.onnela@turku.fi

Turku is a city in Finland with 173'000 inhabitants, 13'000 city employees and 5000 workstations. 150 man-years are spent on IT every year, and 54 people work in central IT.

Reasons for looking into OpenSource: a) OpenOffice in Finnish, b) new M$ licensing policy, c) report on usability of OpenOffice and Linux in Turku City http://www.turku.fi/english/administration_economy/it_department.html

Test Setup

Test setups were created for Linux (1 person) and Windows (4 people sponsored by MS)

The goal of the upgrade is to simplify system management while keeping the user experience at a good level.

The Linux software environment is based on Suse, using Webmin for administration and OpenAFS for home directory access. On the application side they used OpenOffice and Netscape. Installation was done via CD because of the slow network environment. All running off a single Linux server.

The Windows setup was done with RIS through PXE. Office on the software side ... special applications distributed through SMS. 9 servers.

Problems

On the Linux side the problems were mostly due to interoperability with old office documents, the fact that the users are not used to the Linux environment, and many small applications which were not available on Linux.

On the Windows side, problems arose mostly because applications which were OK on Windows NT did not work on Windows XP.

There is a lot of resistance from the user side and from department admins regarding a switch to Linux. Users do not want to work with a different environment and local admins do not want to change to a centralized solution.

Conclusion

Two TCO analysis projects are still underway to determine the financial implications of the two solutions. A decision has not been reached yet on whether to go for Linux or Windows.

(www.turku.fi ...)

Eija is under quite a lot of pressure currently because all major cities in Finland are waiting on the outcome of the Turku project. And things are not looking all that good for Linux because of missing applications and because users as well as local admins are stalling. Maybe Citrix and terminal server can help.

 

NordU 2003 Talk: New Features in Solaris

Thursday, February 13, 2003 10:26 // Aros Congress Center, Västerås, Sweden

by Richard McDougall r@sun.com

Reasons for Big Memory and thus 64bit Solaris

Machines with up to 500 GB of memory are possible. This opens new possibilities, for example keeping huge databases totally in memory and thus eliminating all the read performance problems on the file system level.

UFS in Solaris >= 8

File creation is 10 times faster, file system creation is orders of magnitude faster, directory lookups scale linearly with directory size.

New Tools in Solaris 8

prstat (a better top), mdb (successor to adb and crash), lockstat -k (kernel profiling), kstat (command and perl library for kernel statistics), extended truss (traces library and program calls), new accounting system, cpustat for cache and bus statistics.

Solaris 9 Resource Management

The RM is an infrastructure to automate performance management.

Traditionally machines had to be sized quite big because the workloads were very uneven. With RM it is possible to add workloads at a low priority and thus use all available CPU time without disturbing the main task on the machine.

RM lets you group processes into projects, assign resources to them and also do accounting on them. In /etc/project (or via LDAP/NIS+) you can define process groups by program, user and group and assign resources to each group. With the newtask command a program can be explicitly assigned to a certain project and thus gets access to the respective resources.

Resources are defined in pools, which let you select the number of CPUs and the type of scheduler to be used. The projects are assigned to pools, which then define the resources available to the processes in a project.

The resource constraints facility lets you send signals to programs violating resource limits, or deny them access to resources.

Many of the Solaris performance tools know about the projects concept and can report based on projects instead of processes.

Relevant commands: projects, proj{add,mod,del}, newtask, pooladm, poolcfg, poolbind.

Check (www.sun.com ...)
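To make the projects idea a bit more concrete, here is a minimal Python sketch that splits a project entry into its fields. It assumes the six colon-separated fields documented in the Solaris project(4) man page (name, id, comment, user list, group list, attributes), which were not spelled out in the talk. The "batch" entry and its users are hypothetical, and the sketch only covers the user and group lists.

```python
from typing import List, NamedTuple


class Project(NamedTuple):
    name: str
    projid: int
    comment: str
    users: List[str]
    groups: List[str]
    attributes: str


def parse_project_line(line: str) -> Project:
    """Split one colon-separated project entry into its six fields."""
    name, projid, comment, users, groups, attributes = (
        line.rstrip("\n").split(":", 5)
    )
    return Project(
        name=name,
        projid=int(projid),
        comment=comment,
        users=[u for u in users.split(",") if u],
        groups=[g for g in groups.split(",") if g],
        attributes=attributes,
    )


# Hypothetical entries, made up for illustration.
SAMPLE = [
    "default:3::::",
    "batch:100:low priority jobs:alice,bob:staff:",
]

if __name__ == "__main__":
    for entry in SAMPLE:
        print(parse_project_line(entry))
```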

 

NordU 2003 Talk: GDB old Dog new Tricks

Thursday, February 13, 2003 11:11 // Aros Congress Center, Västerås, Sweden

by Andrew Cagney

Gdb is the most widely used debugger, only MS is still doing their own thing, most other companies have switched to gdb totally or are at least helping it succeed.

New Tricks

Languages C, C++, Java, Fortran, Scheme, Modula-2

Expression parser understands function expressions written in the language of the program and can evaluate them on the fly.

Remote debugging for debugging embedded systems remotely with gdb server.

Program tracing with trigger points to do on the fly monitoring without stopping the program.

Out in the next few weeks: TUI, the gdb split-screen, curses-based text GUI.

The next version will know about multiple architectures. This means a single instance of gdb is able to remote debug code on different architectures. The eventual goal of this is to be able to transparently step into remote procedure calls.

GDB is introducing a new interface called MI (machine interface) to simplify the use of gdb from front end programs. There are very strict criteria on changes to this interface to ensure that front-ends can rely on the stability of the interface.

Handle debugging of optimized code with CFI (gdb 6)

Old Dog

Gdb 1.x came out in 1986 for SPARC, VAX, Tahoe and GOULD ... Andrew is looking for it.

Not really any big new features since 1991; mainly new architectures were added. Almost any CPU ever designed is supported by gdb (and gcc).

Still, the code base is growing exponentially; they are at 1.5M lines now.

A few years back gdb supported 36 architectures. As this is difficult to maintain they have been actively eliminating old code ... they are down to 22 now.

Code Quality Improvements

Select -Werror flags for zero warning tolerance.

GNU Coding standard. Reindent with GNU indent. Strict ISO C. Eliminate subjectivity. Use GNU indent and don't argue.

GDB specific lint which checks for various common problems. Code which does not pass is not accepted.

Move to opaque objects and avoid globals.

(www.gnu.org ...)

 

NordU 2003 Talk: The quest for the lost hardware docs

Thursday, February 13, 2003 11:58 // Aros Congress Center, Västerås, Sweden

by Jes Sørensen jes@wildopensource.com

Jes has been working on the Linux kernel for the last 10 years. He specializes in driver development. Currently he is involved with the consultancy Wildopensource.

The Basic Problem

Drivers are needed to talk to hardware.

Documentation lets us write better drivers, as we have to guess less. Not that documentation generally describes what the hardware really does, but it is a start.

Binary drivers are a problem, as the kernel API changes without regard for binary compatibility with existing drivers.

Open code is generally better because of peer review.

Why are people not releasing specs?

Many companies think that they will lose their competitive edge if the competitors know how to program their HW.

Jes makes the case that this is not true: giving out programming information is not a real IP issue anymore, as today the core of a company's IP is mostly within the chip and not in the interface.

Another problem is that many companies like nVidia for example, do not own all their interfaces, due to cross licensing and patenting issues and are thus not allowed to release source.

Convincing People to release their Specs

Having a driver in the official kernel gets it some automatic maintenance as the code is updated with the kernel, or at least compatibility problems will be discovered more quickly.

Better public acceptance due to a good image within the community and thus better sales.

Free help for debugging the hardware as external driver writers tend to find new problems.

Addressing Execs

Engineers are normally not a problem, they like to share information and help each other out within the limits of the environment they work in.

NDAs are acceptable if they just protect the documentation and not the code written based on the information gained from the docs.

GPL helps as it ensures that the source can not be taken by competitors and included in their closed product.

Flaming and yelling shuts doors. Good behavior helps. Don't use SlashDot.

Petitions might help, but only if the company is interested in this.

What to do when told NO?

Look for alternative vendors; OEMs might be more friendly. E.g. Broadcom's OEMs.

Sometimes new chips are largely based on the previous model and thus the interfaces are similar.

Use reverse engineering, but beware if you live in a non free country like the US, where the DMCA can send you to jail for years. In the EU, interfaces can currently be legally reverse engineered for the purpose of interoperability if the vendor refuses to give specs. Also be aware that some countries believe they have jurisdiction everywhere.

Reverse Engineering

Take a close look at existing drivers for other OSes.

Snoop drivers' register access.

Use srandom to figure out the correct access sequence (Andrew Tridgell of Samba fame used this to figure out how the Vaio Picture Book Camera works).

To avoid licensing issues, get a friend to read the specs and have him tell you how it works.

How to Write a Driver

Do not use a compatibility Layer. Write the driver Linux specific.

Examples

Jes has been using these techniques to write drivers for Alteon, Intel EEPro 100 and Broadcom 570x.

Alteon is no more; with Intel there are now excellent links and they even submit patches. Broadcom was really difficult to work with, but due to pressure from their customers they seem to be coming around.

 

Ufff, NordU talk done

Thursday, February 13, 2003 15:08 // Aros Congress Center, Västerås, Sweden

I really love speaking in front of an audience. This is why it is so easy to convince me to come to conferences. During the last hour I finally had my own talk here at the NordU conference. I was talking about scalable system management concepts in a large environment, presenting the major tools we have developed at the ISG.EE. There were not all that many people in my talk, but taking into account that there were only slightly more than 100 people at the conference and that there were 3 sessions in parallel plus a vendor exhibition, I am actually quite happy. I think I drew over 30%.

Oh yeah, and I kept to the set time of 45 minutes: I finished my talk 2 minutes before the allotted time, with a break halfway through for questions. Now I just need to find a way to lose that adrenaline to be able to concentrate on other talks again.

 

NordU 2003 Talk: Linux on the Itanium

Thursday, February 13, 2003 15:24 // Aros Congress Center, Västerås, Sweden

by Bruno Cornec from the HP/Intel Solution Center.

HP, up to the management level, is now taking Linux seriously. They finance most of the ia64 and wireless work. They employ several key Linux developers, for example Jeremy Allison of Samba fame.

Itanium is HP's future. All operating systems the users require will be provided. This includes Linux, Windows, HP-UX and OpenVMS.

Itanium is a new architecture co-developed by HP and Intel. It includes hardware IA32 emulation. The chip includes the Floating Point Unit from PA-RISC and is thus very fast in this area.

While Itanium is available to whoever wants to buy it from Intel, HP has developed their own high performance chip set for the Itanium 2 which they hope to gain competitive edge from.

HP is not only working on the ia64 architecture but also supporting ports to PA-RISC and Alpha.

HP's David Mosberger is responsible for the Linux ia64 port. His main focus in doing the port is to comply with all the Unix standards for 64 bit as well as keeping the ia64 port close to the ia32 version to ease portability. The ia64 port also includes access to the ia32 hardware emulator.

Several vendors already provide Itanium compatible products: Intel's C Compiler and Oracle, Side Effects Houdini, MSC.Linux, MSC.Nastran, SCI, Quadrics drivers, Myrinet, SSI, Alinka.

HP is supporting external developers in improving the gcc code generation for the ia64 in order to get it on par with Intel's compiler.

HP is working with INRIA on porting MandrakeCluster to the Itanium Platform. (clic.mandrakesoft.com ...)

Tips for porting to the Itanium

Code that already runs on Alpha will just work.

Pointers and Longs are 64bit.

Big-endian is settable for certain programs as required.

Use int32_t, int64_t, u_int8_t

Compile with -Wall and take the warnings seriously.
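The point about explicit integer widths and byte order can be illustrated from Python with the struct module, which offers the same fixed sizes. This is a sketch only: the three-field record is a made-up example, not something from the talk.

```python
import struct


def pack_record(ident: int, size: int, flags: int,
                big_endian: bool = True) -> bytes:
    """Pack a record as int32, int64, uint8 with an explicit byte order.

    '>' or '<' selects big- or little-endian and also switches struct to
    standard sizes, so the layout does not depend on the native 'long'.
    """
    order = ">" if big_endian else "<"
    return struct.pack(order + "iqB", ident, size, flags)


if __name__ == "__main__":
    data = pack_record(1, 2**40, 255)
    print(len(data), data.hex())
```

Because the widths are spelled out, the packed record is 13 bytes on every platform, whatever size the native long happens to be.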

 

NordU 2003 Talk: StarOffice and OpenOffice in Hanstholm

Thursday, February 13, 2003 16:28 // Aros Congress Center, Västerås, Sweden

by Jens Ole Hald of Hanstholm City

Another city switching away from MS Office: Hanstholm in Denmark. Jens Ole Hald of the city IT department tells us how and why they did it.

A testing group of 15 users evaluated StarOffice for 2 weeks in spring 2002. Since November 2002 there have been 300 employees working with StarOffice and OpenOffice.

Most problems were with reading Microsoft formats, but even those were minor and have mostly been fixed in the meantime. Some documents need minor reformatting when opened for the first time, but this is not really a problem. Internal problems with StarOffice were not found.

Users got a 3 hour up-lift course for StarOffice to make them ready for the new tool.

At the moment the workstations are still running on Windows. But they are looking at moving over to Linux.

On the server side they want to stay with Novell. Quote "You have to know and do a lot to make a Windows or Unix box secure. About as much as you have to do and know in order to make a Novell box insecure."

To ease the transition for the Users, the local admins have produced templates and some custom icons and menus mimicking MS Office.

Reasons for changing

The reason for changing was primarily Microsoft's new, more expensive licensing scheme.

Hanstholm was already (or still) using terminal-based programs on IBM mainframes and Unix servers. From the Office suite they were mainly using Word and Excel.

Unix was already deployed on the server in certain areas like Web and Proxy Servers.

Initiating the Transition

In summer all employees were invited to a presentation where the head of the city's administration introduced the new application and also made it clear that the decision to move to OpenOffice had been taken and could not be changed. This set the tone, so acceptance of the new program was very good and people were mostly interested in learning how to use the product rather than in discussing whether they wanted to use it.

Problems

Users who were very experienced with MS Excel had the most problems with the transition, as things in the OpenOffice spreadsheet work slightly differently. But then again, it is probably mostly due to them not really accepting the change yet. They will now get a special 1-week introduction to OpenOffice.

 

NordU 2003 Talk: The Future of JVM performance and innovation

Thursday, February 13, 2003 17:16 // Aros Congress Center, Västerås, Sweden // href

IBM has set up a special group concerned with improving the performance of Java. Robert F. Berry of IBM tells us of their efforts.

JVM innovation is mainly driven by performance enhancements. It started out on the client side, but today Java is really big on the server side.

Java performance on specific hardware has developed into a major selling point.

Performance Improvements

In the memory management area, an enhanced, fully threaded Mark/Sweep/Compact algorithm was developed which uses system idle time for marking and does incremental compaction.
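For readers who, like me, are not fluent in garbage collection, the three phases can be sketched on a toy heap. This is emphatically not IBM's collector: it is a plain stop-the-world version with none of the threading, idle-time marking or incremental compaction mentioned in the talk. The heap layout (a list of dicts whose "refs" are slot indices) and the function names are assumptions made up for this illustration.

```python
def mark(heap, roots):
    """Mark phase: return the set of slot indices reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        index = stack.pop()
        if index in marked:
            continue
        marked.add(index)
        stack.extend(heap[index]["refs"])
    return marked


def collect(heap, roots):
    """Sweep away unmarked objects and slide the survivors together.

    Returns the compacted heap and the roots rewritten to the new indices.
    """
    live = sorted(mark(heap, roots))
    # Sweep + compact: survivors keep their order but move to the front,
    # so every reference has to be translated to the new slot number.
    new_index = {old: new for new, old in enumerate(live)}
    compacted = [
        {
            "name": heap[old]["name"],
            "refs": [new_index[ref] for ref in heap[old]["refs"]],
        }
        for old in live
    ]
    return compacted, [new_index[root] for root in roots]


if __name__ == "__main__":
    heap = [
        {"name": "a", "refs": [2]},
        {"name": "b", "refs": [0]},  # nothing points here: garbage
        {"name": "c", "refs": []},
        {"name": "d", "refs": [3]},  # only points at itself: garbage
        {"name": "e", "refs": [0]},
    ]
    compacted, roots = collect(heap, roots=[4])
    print([obj["name"] for obj in compacted], roots)
```

The self-referencing object "d" shows why marking from the roots is used rather than counting references: it still has a reference to it, yet nothing reachable uses it.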

IBM's just-in-time compiler (JIT) uses an aggressive inlining technique which gives the JIT much more code to look at and optimize. Object allocations can be improved by static analysis of their locality: objects that stay local can then possibly be allocated on the stack, which also saves synchronization time.

Restarting a JVM is expensive, but from a transaction isolation point of view it is a useful concept. To make this a viable solution, a JVM start and clean mechanism has been developed where several JVMs share part of their environment. The startup time for an additional JVM has been reduced by about an order of magnitude.

Future Work

Footprint Size

Very Large Heaps (> 500 GB)

Very Large Systems (n-Way Servers)

Object Pooling (e.g. Jakarta Commons)

Improve decimal arithmetic for banking transactions

Improve performance on XML and XSLT workloads for web services

Conclusion

I find it rather hard to write a report on a topic I am not really fluent in :)

 

NordU 2003 Keynote: Talking to the Walls

Friday, February 14, 2003 08:32 // Aros Congress Center, Västerås, Sweden // href

by Mark Burgess

The increasing availability of mobile communication devices changes our society. Appointments become fluid and can be changed up to the last minute as an SMS will inform the other party of the new location in space and time. People who are physically in the same room can easily avoid any communication with each other as they keep connected to their "friends" over their mobile devices.

On a sociological level the availability of mobile communication devices has not yet been integrated into the framework of our social standards. Where is it appropriate to use a mobile phone? Does the availability of a connection to your peer grant you permission to change appointment times at the last minute? Is it acceptable to have a mobile and turn it off?

Challenges for system management in this context are diversity, as many different technologies will be around until (if ever) a unified standard emerges, and stability in the face of environmental noise.

We must find new ways of keeping the systems within our realm of responsibility in some kind of order. Firewalls make little sense in an environment with a wild mix of interconnected private and company devices. VPNs are giving a hint at things to come.

How can we find system management methodologies for diverse, mobile and changing device populations? Looking at the natural behavior of birds (swarms) or ants (hives) gives clues on how organization can work in such environments. Even today kids can be seen to swarm around town governed by the SMS messages they exchange.

For system management this means that strict central control is a thing of the past. Maybe stable existing structures into which other devices can integrate themselves will work. There is no point in trying to stop the advance of these new technologies. We rather have to integrate them into our environment and adapt the environments to them.

A secure system is one where the risks are known and have been deemed acceptable.

 

NordU 2003 Talk: Injecting RAS into Linux

Friday, February 14, 2003 10:33 // Aros Congress Center, Västerås, Sweden // href

by Richard Moore of IBM

Richard's group is working on getting RAS (Reliability, Availability and Serviceability) into Linux.

In order to get Linux established with previous "Big Iron" customers, a whole new set of requirements becomes important.

As true reliability is not really achievable, the aim is to reach pseudo reliability, which means hiding failing elements of the system from other system components, possibly taking a performance hit while doing so.

In a 2 CPU machine the failure of a single CPU can be recovered from gracefully by shifting the workload to the one working CPU.

The serviceability component means that the system must have the means to detect failing components, ideally before they fail completely, so that they can be replaced. Serviceability is not limited to error detection but encompasses all elements which make a system serviceable. So this includes manuals, problem correlation, debugging tools and logging. Compared to what is available for the IBM S/390, the Linux offering is still in an embryonic state. Many tools are available, but they are often not yet integrated into the mainline kernel, nor is there a consensus on which tools to use. The only consensus is still syslog, which is not easily machine-parseable and thus does not lend itself to automation.

The big advantage of Linux is that there is virtually no old code in the system, compared to old systems like Windows or big iron machines, which have a rich heritage of old code. Linux developers tend not to shy away from ruthlessly eliminating bad code. They would rather break an interface than keep a bad one around. The effect is that the systems are cleaner and easier to maintain.

The documentation problem gets solved in part by the much better code quality in Linux (due to peer review) and the extreme size of the kernel developer community. Also, because the source is available, a lot of the documentation is in the source itself.

Due to the varied workloads of Linux systems, all serviceability functionality must be tailored to the local setup. There is not much use in a mobile phone dumping core into its flash RAM.

 

NordU 2003 Talk: FreeBSD 5.0

Friday, February 14, 2003 11:18 // Aros Congress Center, Västerås, Sweden // href

Murray Stokely of FreeBSD Mall is one of the FreeBSD release engineers. In this talk he spoke about FreeBSD development, organization and what is new and cool in FreeBSD 5.0.

Development and release management

Everything is in CVS

There is a current branch and a stable branch. Material from the current branch gets merged back into the stable branch once it has had enough testing.

Over the last 12 months 160 people have committed code directly via CVS to the FreeBSD kernel. Non-committers are welcome to submit patches via the GNATS bug tracking system.

FreeBSD is highly organized with elected leadership, developer documentation, release engineering, core team.

Tinderbox environments constantly test the current release.

Release 4.x remains supported for the foreseeable future as most FreeBSD sites have very high stability requirements.

New Features In 5.0

Support for kernel scheduled entities which leads to better threads

Device file system

Bluetooth and Firewire support

Mandatory Access Control

UFS2 with bigger inodes to store extended attributes

GEOM - modular disk I/O transformation framework

Device monitoring daemon devd to manage PCMCIA and other pluggable devices

Soft Updates (fs enhancement) with snapshots and background fsck

No more Perl dependency in the base system

New platforms: ia64 and sparc64

More information is on (www.freebsd.org ...)

 

NordU 2003 Talk: Early Userspace

Friday, February 14, 2003 11:50 // Aros Congress Center, Västerås, Sweden // href

by H. Peter Anvin of Transmeta

As the Linux kernel developed, the root file system became more important and had to be able to live in more and more different places, like configurable locations on the disk or even on the network. Eventually even in RAM as initrd entered the scene. This caused all sorts of special cases needing handling to support all these variants. And all this is happening inside the kernel which is tough for development as testing is really painful. So the ideal case would be to be able to organize the booting process in user space.

The solution is to have a virtual root for the system, called rootfs, using the ramfs code. As the kernel starts, / is rootfs and the actual root filesystem becomes a simple overlay mount. This means that it is possible to use an initial RAM disk and get rid of it later, as the kernel threads live in their own world (rootfs).

To simplify matters further, initrd gets replaced with initramfs, which is populated by loading one or several cpio archives. The cpio archives can come from the disk, from the net or even be compiled into the kernel itself. They provide the files required in early user space. To allow initramfs to be small and still provide a useful environment, a special stripped down C library called klibc was developed to provide library code for this case.

Programming in this environment is almost as simple as normal userspace programming. Malloc works as expected, file and socket handling is there. Still there are restrictions, as the rest of the system is not up yet. stdio is available but it is very slow, especially for reading, because klibc does not implement buffering in order to save space.
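Why missing buffering hurts so much when reading can be shown by counting calls. The sketch below has nothing to do with klibc itself: it uses Python's os.read as a stand-in for an unbuffered read, on the assumption that each such call corresponds to one trip into the kernel. The helper names read_all and compare_read_strategies are made up for this illustration.

```python
import os
import tempfile


def read_all(path, chunk_size):
    """Read a file with raw os.read() calls; return (bytes_read, number_of_calls)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        total = 0
        calls = 0
        while True:
            chunk = os.read(fd, chunk_size)
            calls += 1
            if not chunk:  # an empty result means end of file
                break
            total += len(chunk)
        return total, calls
    finally:
        os.close(fd)


def compare_read_strategies(size=4096):
    """Write `size` bytes to a temporary file and read it back both ways."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * size)
        path = tmp.name
    try:
        return {
            "byte_at_a_time": read_all(path, 1),  # what unbuffered getc-style code does
            "one_block": read_all(path, size),    # what a buffer would do for us
        }
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for name, (total, calls) in compare_read_strategies().items():
        print(f"{name}: {total} bytes in {calls} read calls")
```

A program in early userspace that needs to read more than a few bytes is therefore probably better off fetching whole blocks itself and picking them apart in memory.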

Candidates for early Userspace

With this infrastructure the possibilities become endless. The following candidates for early user space come to mind:

Partition detection

Determining the root filesystem type

Network booting

Caveat

Note that this is very much an ongoing development and only available in 2.5.x. (www.kernel.org ...) has more information.

 

My So-Called Life

Monday, February 17, 2003 00:15 // Feldstrasse, Aarburg, Switzerland // href

For the last few weeks Regula and I have been watching the 1994 drama series My So-Called Life. I totally love this show, too bad it got canceled after only 19 episodes. It's the story of 15 year old Angela Chase, her family and high school friends. I won't attempt to tell the story of the show as it is not really the story which makes it come alive. It's more the depiction of Angela's "so-called" life, as well as the lives of the people around her.

I seem to have a knack for getting hooked on TV series that get canceled (Farscape is another example). Here I knew what was going to come, as I had bought the DVD box-set long after the fact. All the more amazing to see that the fan community is still alive and kicking (www.mscl.com ...). Because the series ended so abruptly it offered fertile ground for fan fiction, also called episode 20 fan fiction.

After watching the penultimate episode tonight I went on the net to research if the people behind MSCL had done other things I might want to see. The producers of the series are Ed Zwick and Marshall Herskovitz of the Bedford Falls Production Company, named after the setting of "It's A Wonderful Life". They started working together in 1983 and have since produced several award winning shows, unfortunately not all with great ratings. Some died an early death like MSCL. First there was the all around successful thirtysomething, then after MSCL (www.amazon.com ...) there was Relativity which also got rave reviews but low ratings. Now they are back with Once and Again (www.amazon.com ...) where ratings and reviews seem to be more in sync again.

Unfortunately there seem to be no DVDs of thirtysomething available, but the first season of Once and Again was just released on DVD, so there is at least something to console me once we have watched the last episode of MSCL tomorrow night.

I also found a lengthy article about Zwick and Herskovitz and Bedford Falls at (www.angelfire.com ...)

 

Windows Blues

Wednesday, February 19, 2003 00:01 // ETZ J97, ETH, Zurich // href

Our department is taking part in the ETH Laptop Project. This means, we are helping our students to make better use of their laptops. Currently this means we are developing a Linux and a Windows setup tailored to the requirements of our students. These setups will make it simple for them to integrate their laptops with our Unix Environment. We also have struck a deal with IBM which offers the students IBM Laptops at competitive prices and we will put our own Windows and Linux on these boxes.

Today I have been trying to get the IBM Windows XP installation, which is already on the laptop when the students buy it, into a form where it contains all the latest security stuff and fixes from MS and updates from IBM as well as our locally developed packages. When all the stuff was in, I used the sysprep tool to 'reseal' the machine, so that when the students boot it, it will come up with the usual short setup where the user can define the admin password and has to enter the serial number. Well, that was the plan at least. When I tried to reboot after the sysprep step, Windows came only halfway up and then complained that setup could not continue because two processes were accessing the registry. BOOM. Reboot.

Over the course of the day I tried the whole spiel in many variations, searched the web, hunted through newsgroups. As every try took about 40 minutes, this problem was really painful to debug. Eventually and counter to all I expected, it finally worked. Unfortunately I had twisted and turned so many knobs that I am not sure which one was actually responsible for the sudden success. So tomorrow I will be at it again, trying to verify my recipe for success.

I am so glad that I can mostly work with Unix systems and only have to use Windows occasionally. Whenever there is a problem with a Windows box I feel like I am being forced to wear thick winter gloves while trying to repair a watch, blindfolded and with someone occasionally moving the watch around.

But hey, I am stronger than Windows! Eventually it sits up and begs for food, but the process always is extremely annoying.

 

Excessive Retransmits

Wednesday, February 19, 2003 22:31 // ETZ J97, ETH, Zurich, Switzerland // href

Today around 9am our main Solaris server started acting up. Its performance got patchy. We eventually found that it was suffering from excessive TCP retransmits of up to 1000%. This means that for each packet it sends out on the net it has to try about 10 times until it is successful. This is an extremely high value, or so Virtual Adrian tells us.
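To put the percentage into perspective, here is the arithmetic as a small sketch. It assumes the rate is the number of retransmitted segments relative to the number of data segments sent; how Virtual Adrian computes its figure exactly is not something I have checked, and the function names and the counter values in the example are made up.

```python
def retransmit_percent(data_segments, retransmitted_segments):
    """Retransmitted segments as a percentage of the data segments sent."""
    if data_segments <= 0:
        raise ValueError("need at least one data segment")
    return 100.0 * retransmitted_segments / data_segments


def transmissions_per_segment(percent):
    """Average number of times a segment goes over the wire: the original plus the repeats."""
    return 1 + percent / 100.0


if __name__ == "__main__":
    # Hypothetical counter values, chosen only to reproduce the 1000% figure.
    rate = retransmit_percent(data_segments=2000, retransmitted_segments=20000)
    print(f"{rate:.0f}% retransmits, {transmissions_per_segment(rate):.0f} sends per segment")
```

Under this definition 1000% means ten repeats on top of the original transmission, so the server is spending roughly ten elevenths of its sending effort on data it has already sent.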

We started searching frantically for the reason for the problem, as performance on the server and even more on its clients was suffering badly. After about one hour of web hunting and traffic dumping, we gave in to the pressure from the street and rebooted the beast, hoping that some internals of the kernel had been thrown out of whack and that after a reboot all would be well. And indeed it was, at least for a few minutes. Then the server started misbehaving again, driving its TCP stack through the roof. As rebooting did not help, we went back to tcpdumping and etherealing. I did learn a lot about pcap filter syntax ...

'tcp[tcpflags] & (tcp-rst) != 0 && tcp[tcpflags] & (tcp-ack) = 0'

but nothing about the reason for the retransmits. Fortunately, at this stage, the retransmit rate was not always at 1000% so work was possible for our users.
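For the record, here is what such a filter tests, spelled out as a sketch. It assumes the filter reads "RST flag set and ACK flag clear". The constants mirror what pcap means by tcpflags (byte 13 of the TCP header), tcp-rst (0x04) and tcp-ack (0x10); the function names and the header builder are made up for this illustration and only produce a bare 20 byte header without options.

```python
import struct

TCP_FLAGS_OFFSET = 13  # byte 13 of the TCP header holds the flag bits
TCP_RST = 0x04
TCP_ACK = 0x10


def rst_without_ack(tcp_header):
    """True if the RST bit is set and the ACK bit is clear."""
    flags = tcp_header[TCP_FLAGS_OFFSET]
    return (flags & TCP_RST) != 0 and (flags & TCP_ACK) == 0


def make_tcp_header(flags, src_port=40000, dst_port=80):
    """Build a minimal TCP header (no options, zero checksum) for testing."""
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        0,        # sequence number
        0,        # acknowledgment number
        5 << 4,   # data offset: 5 words of 32 bits
        flags,
        65535,    # window
        0,        # checksum, left empty in this sketch
        0,        # urgent pointer
    )


if __name__ == "__main__":
    for name, flags in [("RST", TCP_RST), ("RST+ACK", TCP_RST | TCP_ACK), ("ACK", TCP_ACK)]:
        print(name, rst_without_ack(make_tcp_header(flags)))
```

Only the bare RST header passes the test; a reset that also carries an ACK, or a plain ACK, is filtered out.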

Then, in the early afternoon, Manuel found that the root disk of the server was causing SCSI timeouts. As if we didn't have enough on our hands already. SCSI timeouts make the machine stop and wait for several seconds at a time. Together with the server, most people using its resources were experiencing the same freezing problem on their workstations.

What a day. I have been writing emails about what was happening to our users all day long, but things were really starting to look bad. Our wonderful reputation for high quality service and superb uptime was going down the drain. It seemed though that most users were not blaming us at this stage, probably due to the fact that I kept them up to date with what was happening.

Around that time David found that in the latest Solaris kernel patch there was a fix for some TCP stack issue which might be related to the retransmits we were still suffering from. He started to put in this patch so that we could activate it when we rebooted. This was going to be necessary anyway as I was preparing to replace the root disk with a fresh device.

Then, suddenly just minutes before the reboot, the server went back to normal, the retransmits were gone and performance was good again, no traces left.

So here I am, another day older and not much smarter about what was causing today's network problems. I can imagine things like a bug in the Solaris TCP stack which can be triggered by a rogue packet and which would cause the symptoms we experienced today, but I suspect that once the real reason is known, it will be way less spectacular.

 

A Fairytale

Thursday, February 20, 2003 23:57 // Feldstrasse, Aarburg, Switzerland // href

Once upon a time, there was this firm, the little Toy Factory, they were building these neat and simple woodblock toys. Kids could use them in various setups. Clever kids could even create their own additional woodblocks and hook them on. Woodblocks seemed quite simple, but the trick with these toys was their clever overall design which made it easy to integrate them with other Woodblock toys and even create your own additions. The factory was really successful with the kids who knew about their toys. Partly also because each Woodblock toy came with a complete manual explaining not only how to use it but also giving a detailed account of how this particular WBT had been constructed. Woodblock toys guaranteed hours of satisfying and creative playtime together with your friends.

Not far to the north west there was this other company, the Lolly Makers, they produced shapely and tasty lollies in many colorful designs. Kids who tried them were really taken by the great taste of the sweet lollies. The lollies sold very well and soon most kids who lived in the vicinity of the lolly factory could be seen wandering the streets with a lolly in their mouth. Interestingly enough these kids seemed to lose all interest in playing with the woodblock toys or other kids apart from talking about the latest 'inventions' of the Lolly Makers. The lollies seemed to be all they needed. Rumors had it that the lolly makers were using addictive and psychoactive substances in their creations. But whoever uttered any suspicions in this direction soon got letters from a big firm in the city, who advised them to refrain from telling any further lies about the Lolly Makers.

 

Webstandards and self fulfilling Prophecies

Friday, March 07, 2003 21:26 // Aarburg, Switzerland // href

Over the last few days I have redesigned the website of my department, and implemented it purely with CSS2. I had to discover the hard way that even 4 years after the standard was published, we are not there yet. While Mozilla is shining bright with its good implementation, many other entries like Opera, Konqueror and IE are working hard to do a good job but fail in odd places. What is amazing is where they do not work. Opera, for example, cannot grok that a box which is defined by its distance from all edges of the browser window is fully defined and can be displayed properly. It just ignores part of the settings, only to draw the box in the wrong place and at the wrong size.
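That such a box really is fully defined takes only two subtractions. The following is just a sketch of the arithmetic, not of how any browser lays out a page: it assumes a box positioned against the browser window with all four offsets given in pixels, and it ignores margins, borders and padding. The function name box_geometry and the numbers are made up.

```python
def box_geometry(window_width, window_height, top, right, bottom, left):
    """Derive position and size of a box from its distances to the four window edges."""
    width = window_width - left - right
    height = window_height - top - bottom
    if width < 0 or height < 0:
        raise ValueError("offsets leave no room for the box")
    return {"x": left, "y": top, "width": width, "height": height}


if __name__ == "__main__":
    # A hypothetical 1024x768 window with a different gap on every side.
    print(box_geometry(1024, 768, top=10, right=20, bottom=30, left=40))
```

Nothing is left open once the four distances are known, which makes a browser dropping some of them all the more puzzling.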

Why does this not get fixed? First I thought there must be many bugs, and they just don't get round to fixing this particular one. But then I got another theory: Everybody who does serious CSS2 web design is working hard to make sure that his pages work with all the players in the CSS arena (Opera, Mozilla, Konqueror and Internet Explorer). Therefore these stupid bugs don't disappear, as nobody notices them but the web designers, who then successfully hide the bugs from the end users by going the extra mile to make their pages work with all browsers.

How about a website which collects pages that pass all CSS/HTML conformance tests at (validator.w3.org ...) with flying colors, but got axed because display problems in some browser prevented them from working fully 'cross platform'?

 
