Monday, December 21, 2009

EON ZFS Storage 0.59.9 based on snv 129, Deduplication release!

Embedded Operating system/Networking (EON), the RAM-based live ZFS NAS appliance, is released on Genunix! This is the first EON release with inline deduplication! Many thanks to Genunix.org for download hosting and serving the OpenSolaris community.

EON Deduplication ZFS storage is available in 32 and 64-bit, CIFS and Samba versions:
tryitEON 64-bit x86 CIFS ISO image version 0.59.9 based on snv_129
tryitEON 64-bit x86 Samba ISO image version 0.59.9 based on snv_129
tryitEON 32-bit x86 CIFS ISO image version 0.59.9 based on snv_129
tryitEON 32-bit x86 Samba ISO image version 0.59.9 based on snv_129
tryitEON 64-bit x86 CIFS ISO image version 0.59.9 based on snv_129 (NO HTTPD)
tryitEON 64-bit x86 Samba ISO image version 0.59.9 based on snv_129 (NO HTTPD)

New/Changes/Fixes:
- Deduplication, Deduplication, Deduplication. (That only used 1x the amount of storage space)
- The hotplug errors at boot are being worked on. They are safe to ignore.
- Cleaned up minor entries in /mnt/eon0/.exec. Added "rsync --daemon" to start by default.
- EON reboots at GRUB (since snv_122) under ESXi, Fusion and various versions of VMware Workstation. This is related to bug 6820576. Workaround: at the GRUB menu press e and append "-B disable-pcieb=true" to the end of the kernel line.

40 comments:

dimsoft said...

great news.
I will try it with the Intel SRCS16 RAID controller.

Peter J. Lu said...

Awesome!

A quick question: is CIFS or SMB recommended for access from WinXP and Win7?

dimsoft said...

I tried to add the lsimega driver:
/boot/solaris/bin/root_archive pack x86-new.eon /tmp/dd

error:
cd: ./tmp/root: [No such file or directory]

Andre Lue said...

Peter,

I would recommend starting with CIFS. I have tested with Win XP, 2000, OSX but not Win 7. I don't think there should be any problems with Win 7.

dimsoft,

After verifying the driver works you would add it to /mnt/eon0/.backup and run updimg.sh to permanently preserve it. See these posts 1, 2 as refreshers. It looks like you attempted method 1 in the first link, which is done on another working opensolaris system, not EON.

dimsoft said...

That is the guide I followed:
1) unpack: worked correctly
2) add driver
3) pack: got an error

I used OpenSolaris 2009.6

I also could not cd into the folder /tmp/root: no permission

Andre Lue said...

dimsoft,

Are you user root? Try adding pfexec in front of your command.

dimsoft said...

I used the command
su -
and entered the password.

Or do I need to log in as user root?

dimsoft said...

I used pfexec: same error again

Andre Lue said...

dimsoft,

Can you kindly start a thread in opensolaris discuss and post all your commands from unpack to error?

Is /tmp/dd the mounted uncompressed image or a directory with the contents copied?

dimsoft said...

My actions:
1) mkdir /tmp/dd
2) /boot/solaris/bin/root_archive unpack x86.eon /tmp/dd
3) cp /kernel/drv/lsimega /tmp/dd/kernel/drv
4) cp /kernel/drv/lsimega.conf /tmp/dd/kernel/drv
5) cp /kernel/drv/amd64/lsimega /tmp/dd/kernel/drv/amd64
6) add_drv -b /tmp/dd -n -v /kernel/drv/amd64/lsimega
7) /boot/solaris/bin/root_archive pack x86-new.eon /tmp/dd

Andre Lue said...

dimsoft,

Looks like you did everything correctly. Can you post the exact error(s)?

Can you make a copy of the failsafe compressed image that comes with 2009.6, repeat the steps and report if you get the same error?

Peter J. Lu said...

Is anyone else seeing long pauses in file copy operations between machines running SNV_129 and EON? I've got a Core i7 2.8 GHz proc in my main machine, with 10 GB of RAM. My EON box is an older Core2 (two Core2 duo procs at 2.66 GHz) with 4 GB of RAM, running the most recent EON.

In all cases, when I copy, it works great for a few seconds, often hitting the ethernet bandwidth limit of 120 MB/sec (very impressive to watch in the System Monitor). Then the transfer rate goes to zero and remains there for 10 seconds, then the copy starts again. The process repeats (copy at high rate for a few seconds; pause with no file transfer activity for a few seconds). I have no idea why this is happening. The file I'm copying is a single 3 GB image data file (dragging and dropping in gnome/nautilus).

My main zpools are mirrors of two Hitachi 2 TB SATA drives, attached to the motherboard's SAS controller. This is the case on both machines (i.e. two drives mirrored in the SNV129 machine, two drives mirrored on the EON box).

I find the same behavior, whether or not dedup and/or compression is enabled (it's arguably slightly faster with compression/dedup, actually). And I've also used an SLC SSD (which is the rpool on my SNV 129 machine). In all cases, the copying is brilliant for a few seconds, then comes to a full stop for a few more, and then repeats.

I've just used the standard EON install, with the drive mapped using NFS sharing in ZFS; the instructions on enabling NFS server according to the FAQ, enabling NFS client on my SNV 129 machine, and then the standard mount -F nfs command.

So it's about as simple a setup as I can make it, under the circumstances. Nothing should be hardware limited, either, as far as I can tell. So I'm stumped!

Does anyone else see this kind of behavior? Thanks for any advice, and Merry Christmas

dimsoft said...

"Can you make a copy of the failsafe compressed image that comes with 2009.6"

What is its path and file name?

Andre Lue said...

Peter,

Please try the following:
- Open 2 terminals on the EON machine.
- In the first, show network traffic. In my case bge0 is the name of the interface; substitute your interface name.
dladm show-link bge0 -s -i 1
- In the other terminal run
zpool iostat -v 1
- Then start your 3 GB copy and observe network and zpool traffic. See if anything glaring jumps out. Feel free to start an opensolaris thread and post the output and I'll take a look. Also watch /var/adm/messages and post any messages that appear during the test. This can be done in a 3rd terminal with
tail -f /var/adm/messages
started before you begin the 3 GB copy.

Andre Lue said...

Peter,

Also, can you tell if this happens if you copy the file locally between 2 zfs filesystems?

jimklimov said...

Peter: the behavior you describe seems similar to ZFS transaction closing (see OpenSolaris Forums discussions earlier in the year).

The gist of that problem is (roughly) that your ZFS writes are cached in RAM first; then, either when the TXG sync timeout expires or when the RAM available to ZFS runs out, the ZFS subsystem blocks until it has flushed all the writes to disk. (In my case, most system activity was also blocked, as some write was pending somewhere, like a syslog update or whatever.)

This was tuned in recent distro builds; some more logic was added to make systems more responsive at these times. One working suggestion was to lower the timeout (or MB count) at which the ZFS write-to-disk fires.

This theoretically lowers the file server performance somewhat (on average) but makes it much more usable during bulk writes (especially if it's also an interactive system).

HTH,
//Jim

curry-col said...

Great work!

I wanted to ask, how can I add more binaries to "local"? I need to have "your" build environment, right? I think a static binary from OpenSolaris would work too, but I don't have any other OpenSolaris system at the moment.

I wanted to ask you to add two more binaries to your bin distribution.

First one is GNU screen, because it's incredibly useful for running long-running commands on your nas box.

The other one is aria2, a multiprotocol download client in just one CLI binary, because transmission's lack of pre-reservation and the resulting transfer rates of downloaded files are just annoying.

But anyway, thanks for your work :)

(also, FAQ entry about child mounts in CIFS and guest access would be nice. I can do that myself, if you want)

Andre Lue said...

Peter,

I believe you are hitting the ZFS write throttle with TXGs getting delayed.

Can you kindly add the output of zpool status -v when you post a thread, so we can understand how your pool was built?

Jimklimov,

Kindly correct me if this is not the same issue.

Andre Lue said...

curry-col,

See the binary kit section in downloads. You would add new package names to bin-pkg.list and run binpkg.sh. This requires a full OpenSolaris system and the DVD of the snv_xxx release you wish to build.

I will take a look at compiling aria.

Andre Lue said...

curry-col,

Feel free to email the FAQ additions to eonstore AT gmail dot com.

dmitry.sorokin said...

Is it possible to include SUNWbridger and SUNWbridgeu packages in the next EON release? These are for crossbow bridging support (all dladm *-bridge commands).

Thanks,
Dmitry

Guido said...

Andre,

thanks for the 129er release.

What's the preferred way to make

smbadm join -w xyz

persistent?

I've added it to .exec (and have .exec in .backup).

How are others doing it?

Andre Lue said...

Guido,

If you added:
/usr/sbin/smbadm join -w xyz to /mnt/eon0/.exec
/var/smb/smbpasswd to /mnt/eon0/.backup

That is correct.

tk said...

When booting from CD I receive a "no relocation information found for module /platform/i86pc/kernel/amd64/unix" error.

Andre Lue said...

tk,

That error usually means you do not have a bootable kernel, which most likely means one of the following:
- a corrupt downloaded image
- an incorrectly burned image
- a 64-bit ISO burned and booted on 32-bit-only hardware (rare failure)

unicron said...

Today I tried EON for the first time, because I would love to use ZFS on Solaris.

When EON is almost booted I get the message: smbd[615] dyndns failed to get dommainname

I have a Realtek network card that is detected (it looks like the rge driver is now included in EON by default) and I use DHCP to get an IP address.

I also tried to use the ping command, e.g ping www.google.com. But eon cannot resolve the address. Is this because there is no DNS server available/configured?

Second question is if there is another editor on EON besides 'vi'?

unicron said...

After running the updimg.sh script the message disappeared, so I probably just had to run updimg.sh once first.

Andre Lue said...

unicron,

The message
smbd[615] dyndns failed to get dommainname
appears because DNS is not yet configured. After setting up /etc/resolv.conf and running
cp /etc/nsswitch.dns /etc/nsswitch.conf
the message should go away. It is mostly an informational message.
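
A minimal sketch of those two steps, assuming a single DNS server. The function name configure_dns and the address in the usage comment are placeholders, not part of EON, and the function overwrites any existing resolv.conf:

```shell
#!/bin/sh
# Sketch only: write a one-line resolv.conf and switch name lookups to DNS.
# configure_dns is a hypothetical helper, not an EON tool.
# $1 = DNS server address, $2 = config directory (defaults to /etc).
configure_dns() {
    nameserver=$1
    etc=${2:-/etc}
    # Overwrites any existing resolv.conf with a single nameserver entry.
    echo "nameserver $nameserver" > "$etc/resolv.conf"
    # Use the DNS-enabled name service switch template.
    cp "$etc/nsswitch.dns" "$etc/nsswitch.conf"
}

# Example (placeholder address, substitute your own DNS server):
#   configure_dns 192.168.1.1
```

To keep the result across reboots, the changed files would presumably also have to be listed in /mnt/eon0/.backup before running updimg.sh, as described for other files in this thread.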

Vi is currently the only editor.

Peter J. Lu said...

Andre:

I have run the tests you mentioned, but I didn't see anything particularly illuminating to me:

eon:1:~#zpool status -v
  pool: plu_backup_data
 state: ONLINE
 scrub: none requested
config:

        NAME                 STATE     READ WRITE CKSUM
        plu_backup_data      ONLINE       0     0     0
          mirror-0           ONLINE       0     0     0
            c0t0d0           ONLINE       0     0     0
            c0t1d0           ONLINE       0     0     0

errors: No known data errors

eon:3:~#dladm show-link bge0 -s -i 1
LINK    IPACKETS    RBYTES      IERRORS  OPACKETS  OBYTES     OERRORS
bge0    22379149    3273841816  0        12288483  228555376  0
bge0    26          6460        0        29        6342       0
bge0    26401       39517266    0        14241     1099084    0
bge0    29548       43973788    0        15816     1220946    0
bge0    75482       112993998   0        40672     3129300    0

eon:4:~#zpool iostat -v 1
                    capacity     operations    bandwidth
pool             alloc   free   read  write   read  write
---------------  -----  -----  -----  -----  -----  -----
plu_backup_data  42.5G  1.77T      1    806  2.70K   101M
  mirror         42.5G  1.77T      1    806  2.70K   101M
    c0t0d0           -      -      0    869  1.20K   109M
    c0t1d0           -      -      0    806  1.50K   101M
---------------  -----  -----  -----  -----  -----  -----

eon:5:~#tail -f /var/adm/messages
Feb 3 00:21:01 eon pseudo: [ID 129642 kern.info] pseudo-device: profile0
Feb 3 00:21:01 eon genunix: [ID 936769 kern.info] profile0 is /pseudo/profile@0
Feb 3 00:21:01 eon pseudo: [ID 129642 kern.info] pseudo-device: sdt0
Feb 3 00:21:01 eon genunix: [ID 936769 kern.info] sdt0 is /pseudo/sdt@0
Feb 3 00:21:01 eon pseudo: [ID 129642 kern.info] pseudo-device: systrace0
Feb 3 00:21:01 eon genunix: [ID 936769 kern.info] systrace0 is /pseudo/systrace@0
Feb 3 00:30:27 eon ntpd[1324]: [ID 702911 daemon.notice] frequency error -512 PPM exceeds tolerance 500 PPM
Feb 3 00:31:32 eon last message repeated 1 time
Feb 3 00:32:36 eon ntpd[1324]: [ID 702911 daemon.notice] frequency error -512 PPM exceeds tolerance 500 PPM
Feb 3 00:33:40 eon last message repeated 1 time

Andre Lue said...

Peter,

Based on your 115 MB/s network and 101 MB/s disk numbers, I'd say the behavior seen is expected. By your pool design (a two-disk mirror) you have the write bandwidth of 1 disk but the read bandwidth of 2 disks.

As a possible performance increase you can try lowering the transaction group timeout so that it doesn't have to flush as many transactions to disk at once. You can try that by adding the following to /etc/system (default=30), with a value of 5 or 1:
set zfs:zfs_txg_timeout=5

Another possibility is redesigning your pool to get more disks and bandwidth per pool.

You can also try setting the ZFS dataset property logbias (throughput vs latency, set with zfs set) in combination with zfs_txg_timeout changes.
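
When experimenting with several timeout values it is easy to end up with more than one zfs_txg_timeout line in /etc/system. A rough sketch that replaces any earlier setting instead of appending a second one; set_txg_timeout is a hypothetical helper, not something shipped with EON, and /etc/system is only read at boot, so a reboot is still needed:

```shell
#!/bin/sh
# Sketch only: keep exactly one zfs_txg_timeout line in an /etc/system-style file.
# set_txg_timeout is a hypothetical helper, not an EON or Solaris tool.
# $1 = timeout in seconds, $2 = file to edit (defaults to /etc/system).
set_txg_timeout() {
    secs=$1
    file=${2:-/etc/system}
    tmp="$file.tmp.$$"
    # Create the file if it does not exist yet.
    [ -f "$file" ] || : > "$file"
    # Copy every line except an earlier zfs_txg_timeout setting.
    grep -v '^set zfs:zfs_txg_timeout=' "$file" > "$tmp"
    # Append the new value.
    echo "set zfs:zfs_txg_timeout=$secs" >> "$tmp"
    # Write back in place so the original file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example, as root:
#   set_txg_timeout 5
```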

erie said...

I have been using EON since the 129 release. I started using it as a CIFS server for a Windows Server 2008 R2 system, which has worked very well. I set it up with a zpool of six 0.5 TB disks in raidz. I have 1 directory shared via CIFS. I have copied Windows directories with multiple levels and multiple 5 GB files (VHDs) without a problem over the 1 Gb LAN. I recently added 2 iSCSI volumes so I can mount VHDs in them. The problem is that it is impossible to copy any 40 GB VHD to a LUN without a disconnection. Only the LUN being written to becomes disconnected. It seems 129 requires COMSTAR for iSCSI. Has anyone else had a problem like this? Can the problem be avoided by going back to the old iSCSI target?

Andre Lue said...

erie,

The switch to COMSTAR was made mostly because of performance reasons and iscsitgtd is no longer supported.

I suggest starting a detailed opensolaris.org thread with as much info as possible on your setup and the disconnect. Also post a link back to the thread.

Is there any alarming info in /var/adm/messages when the disconnect occurs? Does the copy disconnect the LUN at the same point, always?

PhoenixUA said...

What about a new release of EON based on snv_132?
Some COMSTAR bugs were fixed: 6914809 and 6910810

erie said...

Andre,
I saw your recommendation to try
"set zfs:zfs_txg_timeout=5".
I have 8 GB of memory and am seeing long write periods at high IO and CPU usage. 2 questions:
1) When should I see the effects of the change?
2) What do I need to do to get the settings in /etc/system to be remembered across a reboot?

Andre Lue said...

PhoenixUA,

The next release of EON will be based on snv_130 as EON is based on SXCE, which is EOL'd. Opensolaris is still being researched along with trying to get IPS packages to conform.

erie,

Add the changes to /etc/system, add /etc/system as an entry in /mnt/eon0/.backup, then run updimg.sh. This is the method to preserve files through an update. The changes should be in effect at the next boot. If mdb were available the change could be made almost instantaneously. Before changing things I recommend viewing the network and disk utilization. Make sure you are not already at the limits of either, so that you do not tune in the wrong direction.
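
The .backup step can be sketched as below. This assumes /mnt/eon0/.backup is a plain list with one path per line and that grep understands -q, -x and -F (on Solaris that may mean /usr/xpg4/bin/grep); add_to_backup is a hypothetical helper, not an EON tool:

```shell
#!/bin/sh
# Sketch only: list a file in EON's backup list so updimg.sh preserves it.
# add_to_backup is a hypothetical helper; the one-path-per-line format of
# /mnt/eon0/.backup is an assumption.
# $1 = path to preserve, $2 = backup list (defaults to /mnt/eon0/.backup).
add_to_backup() {
    path=$1
    list=${2:-/mnt/eon0/.backup}
    # -x matches the whole line, -F treats the path literally; append only if absent.
    grep -qxF "$path" "$list" 2>/dev/null || echo "$path" >> "$list"
}

# Example, as root on the EON box:
#   add_to_backup /etc/system
#   updimg.sh    # check your release for the exact invocation and arguments
```

Calling it twice with the same path leaves a single entry, so it is safe to rerun after each change.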

unicron said...

Hi,

I'm having a problem with the sharesmb property of zfs. I'm trying to use host-based access rules. I have two machines which have fixed IP addresses. Using the sharesmb property, I'm trying to set things up so that one machine is allowed to read and write and the other machines can only read.

According to the documentation I should be able to use IP addresses, but they don't seem to work.

For example I tried:
zfs set sharesmb=rw=@192.168.1.2,ro=* mypool
zfs set sharesmb=rw=@192.168.1.2/32,ro=* mypool

The result is that machine 192.168.1.2 has read-only access. Why is it ignoring the 'rw' part?

Am I misinterpreting the documentation?

unicron said...

Is it still the case that pools are not mounted at startup?

I don't like the uncommenting of zpool import in .exec because it fills up the history of the pool

Andre Lue said...

unicron,

A client host is permitted to have only one of the following types of access to a share:
* Read-only access
* Read-write access
* No access

Yes, the zpool import is still needed.

Rainer said...

Hi,
I started setting up an EON NFS server yesterday and have some problems with the current version.
I had already copied some gigabytes over during the night, then there was a reboot and now the zpool seems to be gone.
This runs on a XEN host with the image on a .img file and the storage disk as a real disk.
Please help...
... and where can we send bugs to?

Andre Lue said...

Rainer,

You can start a thread using the Discuss link in the upper right of the blog.

Please provide more details, like
- what kind of zpool
- what is meant by "then there was a reboot"
- does the image still boot and what happens when you try to import the pool?
The more info you provide the better.