Why Use FTP
When it comes to backing up VMDK files, there are lots of different opinions and methods. The methods range from using FTP, as esXpress does, to mounting remote shares on your host server, to using local space.
esXpress also supports SSH as a network transport for backups.
Mounting an NFS or SMB (CIFS) share on the host server
This is risky. When you mount remote shares on your host, you put the host at risk of needing a reboot. If you're accessing a remote NFS share and there is a network problem, the process using the NFS share can hang. The only way to end it is to reboot the host.
Suppose your backup software was copying or tar'ing a VMDK to an NFS share, and the backup server hung or you had a networking issue. The backup process will hang, hold the lock open on the VMDK file, and will probably spin out of control and start eating up console resources. When you then log in to the host and run df (vdf), your session will hang too, because df cannot stat the mounted NFS drive that has disappeared.
Now you have to reboot the host machine to get back to normal. Every time you mount a remote share, you risk this happening.
When pushing backups to another host, you have to have many more ports opened and mapped for NFS or SMB to work.
Using FTP
esXpress uses FTP to back up the VMDK files from the host servers. FTP is the fastest method of getting data off of your /vmfs filesystem, and achieving network wire speed is the norm. If your host console (or backup NIC) has a 100Mb connection, you should be able to get the full 11 megabytes a second of bandwidth (100Mb is about 11MB a second). If your network is gigabit, you can get speeds of 25MB-35MB a second.
10MB a second is 35GB an hour. 20MB a second is 70GB an hour.
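The hourly figures above can be reproduced with a short calculation. This is only a sketch of the arithmetic; it assumes 1GB = 1024MB, which is the convention that yields the rounded numbers quoted here, and the function name is made up for illustration.

```python
def gb_per_hour(mb_per_second):
    """Convert a sustained transfer rate in MB/s to GB per hour.

    Assumes 1 GB = 1024 MB (an assumption, chosen to match the
    rounded figures in the text).
    """
    seconds_per_hour = 3600
    return mb_per_second * seconds_per_hour / 1024


# Rates mentioned in the text: 100Mb link (~11MB/s) and gigabit (25-35MB/s).
for rate in (10, 11, 20, 25, 35):
    print(f"{rate}MB a second is about {gb_per_hour(rate):.0f}GB an hour")
```

Dividing the total size of your VMDK files by the hourly figure gives a rough estimate of how long a full backup window needs to be.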
Having multiple host servers each sending data at 25MB a second across your network is not a great idea, though. That's why we also compress/encrypt the data before sending it across the wire.
Using FTP is much easier than mounting shares, and it is much easier to use through NAT servers.
When we FTP a VMDK file from a host /vmfs, it is FTP'd directly to the backup FTP server.
No helper machines are required. Most good Windows FTP server software costs around $50.
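To illustrate what a direct push to an FTP server looks like, here is a minimal sketch using Python's standard ftplib module. It is not the esXpress implementation: it leaves out compression, encryption, and everything else esXpress does before sending, and the host name, credentials, and paths in the usage comment are hypothetical.

```python
from ftplib import FTP


def upload_file(ftp, local_path, remote_name, blocksize=1024 * 1024):
    """Stream one local file to the FTP server in binary mode."""
    with open(local_path, "rb") as src:
        # STOR writes the stream straight to the server; no local copy is made.
        ftp.storbinary(f"STOR {remote_name}", src, blocksize=blocksize)


def backup_to_ftp(host, user, password, local_path, remote_name):
    """Open a session to the backup server and push a single file to it."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        upload_file(ftp, local_path, remote_name)


# Hypothetical usage (server, account, and file names are placeholders):
# backup_to_ftp("backup.example.com", "backupuser", "secret",
#               "/path/to/source.vmdk", "source.vmdk")
```

The file is read once from the source and written once on the FTP server, with no intermediate machine or staging copy in between.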
Using local host /vmfs space
You can use local space in two different ways.
- You could have a backup drive such as /vmfs/BACKUPS that you copy your backups to.
- You could have just a little /vmfs space to make a tar file, then copy that tar file to another host, through FTP.
Both of these solutions require that your host have a large excess of space.
Using /vmfs space for backups is not the best solution. You are wasting valuable space on your host machines, whether it's local SCSI drives or SAN drives. This space would be better used for running virtual machines.
Plus, you want your backup files sitting on a fileserver elsewhere on your network, not stored on a /vmfs that is on the same SAN as all your VMDKs.
Backing up through the console from one /vmfs to another /vmfs is not fast.
The claim that this is better than pushing through the network is not true. When you copy a VMDK from one /vmfs to another, you use twice as much SAN bandwidth: you read the VMDK, then you write the VMDK. You are going across your fiber fabric twice instead of just once. The more traffic across the SAN network, the slower it gets.
Local Backup /vmfs Example:
Suppose you have 10 host servers on the SAN, and one large SATA rack for backups.
You use a virtual machine as the FTP backup server and all your backup space is a VMDK sitting on a LUN on the SATA rack.
If all 10 hosts start backing up VMDK files and their copy speed is only 7MB a second (let's use a small number), we are pretty much overwhelming the SAN fabric, not to mention the SAN itself.
That is 10 hosts reading at 7MB a second, then writing at 7MB a second. We have 140MB a second of SAN network traffic: 70MB a second reading and 70MB a second writing. This is on top of the regular traffic of actually running your virtual machines. Backing up your VMDK files should not be a performance hit on your running virtual machines.
If you're backing up to another host on the network (the preferred method is a separate backup network), you are putting only 70MB a second of read bandwidth on the SAN, and each host machine is pushing 7MB a second through the network. It is better to use the network.
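The comparison in this example comes down to simple arithmetic, sketched below. The function name and its parameters are made up for illustration; the 10 hosts and 7MB a second are the figures from the example.

```python
def san_traffic(hosts, rate_mb_s, backup_on_san):
    """Return (read, write, total) SAN traffic in MB/s during backups.

    Every host reads its VMDK files from the SAN. The data is written
    back to the SAN only when the backup target is itself a SAN LUN.
    """
    read = hosts * rate_mb_s
    write = read if backup_on_san else 0
    return read, write, read + write


# Backup space is a LUN on the same SAN: read once, write once.
print(san_traffic(10, 7, backup_on_san=True))

# Backups pushed over the network to a separate server: read only.
print(san_traffic(10, 7, backup_on_san=False))
```

With the backup target on the SAN, the total is double the read traffic; sending the data over the network instead removes the write half from the SAN entirely.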
Plus you can have multiple FTP backup servers to reduce network traffic.
If you still choose to back up your VMDK files to a /vmfs/BACKUP partition, you are left with the problem of how to write them to tape. You either need an empty VMware ESX server that does nothing but back up the /vmfs/BACKUP folder, or you run the backup on another host that is running virtual machines. You either use a tape agent or directly attach a tape drive to this VMware ESX host. Doing either of these can cause problems, and may force you to reboot. That is why it's advised to use a dedicated backup VMware ESX server for this purpose, a host that runs no virtual machines.
Back to the example above, now that you backed up your VMDK files by reading and writing them to the SAN, you then read them all back again to write out to the tape agent or tape drive.
esXpress does not require any local /vmfs space to make the backups (just a little for work files) and it requires only a small amount to restore a DELTA and a FULL backup to a host.
Questions and Answers
Q: We cannot use FTP because it's not secure. We must use secure FTP.
A: You should be backing up to servers in your own data center on a private backup network. Secure FTP is much slower. If encryption is a concern, use our esXpress Enterprise, which includes encryption. Secure FTP is coming in the production version next month.
Q: How do I use my 'enter name here' backup agent with esXpress?
A: If your host machines are FTP'ing your backups to a Windows host, just install the backup agent on that one server and back up the daily files. The same goes for Linux. You only need one license for all your host backups, provided you have one huge backup FTP server.