OpenStack Multi-Node with a Single NIC

March 20, 2015

This post is a summary of my experience getting a multi-node, single-NIC setup up and running.  I have two older laptops (a relative term, they are still x64 with Intel VT, so not that old) that I use in this scenario.  One is a Lenovo (called stinkpad) and the other is a Dell (called dell).  The purpose of all this is to enable Neutron networking on a Juno Ubuntu OpenStack setup.

Let’s define what I’m doing here.

My switch/router is a Linksys N3000 running the dd-wrt Kong build; basically, it lets me use VLANs.  I actually have it on the same LAN segment as my Comcast Business router (which has awesome wifi, as a side note).  This subnet is 10.1.10.0/24, with the Comcast router at .1 and the dd-wrt at .2.  It is important to note I am NOT NATing between these; .1 does the DHCP and routing.  I am plugging my laptops into the dd-wrt directly (I can't do VLANs on the Comcast router).  Why is my setup this complicated? No clue, it's just what was there when I got this working.  (See here for someone who blogged about this in a bit more depth; I would caution you to use ssh and the command line for the VLANs, the GUI is not 100% reliable!)

Just a side note: I wasted a ton of time trying to use wlan0.  Not only does the Centrino chip on the stinkpad have issues, but it will never work.  The wifi interface on the router doesn't play well with masqueraded or rewritten packets, nor will it do VLAN trunking the way I want it to.

Stepping back, we are going to have three networks, but you only need two.  One is the external internet-facing one, my LAN 10.1.10.0/24, which all computers behind my router can reach.  This will be the network with the floating IP range allocated, the "public" or "external" network in OpenStack.  If you are using all your IPs, you can have the dd-wrt NAT from the Comcast router onto a new subnet, but I didn't feel like figuring out the translation from a computer on the Comcast subnet to the dd-wrt one.  I chose to allocate DHCP on .100-.200, so I used the 10.1.10.10 to 10.1.10.60 range for OpenStack (giving me about 50 IPs; the router will use the first one).

Using the dd-wrt wiki, I configured two VLANs (I really just needed one, but why stop at one?).  I used VLAN 5 and VLAN 6.  VLAN 5 is the 10.5.0.0/24 subnet and VLAN 6 is the 10.6.0.0/24 subnet.  They are trunked to the ports, and my dd-wrt configuration is below.

#===========================================================#
# DD-WRT V24-K26 #
# Kong Mod #
#===========================================================#

BusyBox v1.21.0 (2014-06-07 21:53:22 CEST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

root@DD-WRT:~# nvram show | grep vlan.*ports
size: 27882 bytes (33558 left)
vlan6ports=4t 3t 2t 1t 8
vlan2ports=0 8
new_vlan2ports=0 8
vlan5ports=4t 3t 2t 1t 8
vlan1ports=4 3 2 1 8*
new_vlan1ports=1 2 3 4 8*
root@DD-WRT:~# nvram show | grep port.*vlans
port5vlans=1
port3vlans=1
port1vlans=1
port4vlans=1
port2vlans=1
port0vlans=2
size: 27882 bytes (33558 left)
root@DD-WRT:~# nvram show | grep vlan.*hwname
new_vlan1hwname=et0
vlan6hwname=et0
vlan2hwname=et0
vlan5hwname=et0
vlan1hwname=et0
new_vlan2hwname=et0
size: 27882 bytes (33558 left)
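
If you want to set this up from the shell rather than the GUI, something along these lines should work (a rough sketch based on my values above; the exact port list depends on your switch, and a bad VLAN config can lock you out, so be careful):

nvram set vlan5hwname=et0
nvram set vlan5ports="4t 3t 2t 1t 8"
nvram set vlan6hwname=et0
nvram set vlan6ports="4t 3t 2t 1t 8"
nvram commit
reboot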

As for the servers, I used the debian wiki and Steve Weston’s blog for help here.

Configure eth0 on the stinkpad (my controller/compute/network node) with a static IP

# interfaces(5) file used by ifup(8) and ifdown(8)

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 10.1.10.165
netmask 255.255.255.0
gateway 10.1.10.1
dns-nameservers 10.1.10.1 75.75.75.75

and for the dell

# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 10.1.10.223
netmask 255.255.255.0
gateway 10.1.10.1
dns-nameservers 10.1.10.1 75.75.75.75

I rebooted here (a good time to do an apt-get update && apt-get dist-upgrade) and made sure everything came back up.

After the reboot, add the following to /etc/sysctl.conf to enable packet forwarding

net.ipv4.ip_forward=1
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0

To load the values run the following

sudo sysctl -p

Install the following packages

sudo apt-get -y install ntp vlan openvswitch-switch

Load the 802.1Q module and make it load at boot

sudo modprobe 8021q

sudo sh -c 'echo 8021q >> /etc/modules'

Now create the VLAN interfaces on both servers and assign IPs (the .2 addresses below are for the stinkpad, the .4 addresses are for the dell)

sudo vconfig add eth0 5
sudo ip addr add 10.5.0.2/24 dev eth0.5
sudo ip link set dev eth0.5 up

sudo vconfig add eth0 6
sudo ip addr add 10.6.0.2/24 dev eth0.6
sudo ip link set dev eth0.6 up

Do the same for the dell

sudo vconfig add eth0 5
sudo ip addr add 10.5.0.4/24 dev eth0.5
sudo ip link set dev eth0.5 up

sudo vconfig add eth0 6
sudo ip addr add 10.6.0.4/24 dev eth0.6
sudo ip link set dev eth0.6 up

Make sure the interfaces are up and that you can ping one node from the other.
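
For example, from the stinkpad (using the addresses above) you should be able to reach the dell on both VLANs:

ping -c 3 10.5.0.4
ping -c 3 10.6.0.4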

Let's add them to /etc/network/interfaces. This is for the dell, but do the same on the other node (with the .2 addresses)

auto eth0.5
iface eth0.5 inet static
vlan-raw-device eth0
address 10.5.0.4
netmask 255.255.255.0

auto eth0.6
iface eth0.6 inet static
vlan-raw-device eth0
address 10.6.0.4
netmask 255.255.255.0

Now at this point I am going to create an OVS bridge for my primary interface. If you aren’t on the console or using one of the VLAN ports to ssh then you’ll want to change over since this will disconnect your current session.

First let’s create the bridge, assign the port and assign the IP (this is for dell but you obviously need to do this on both nodes)

sudo ovs-vsctl add-br br0
sudo ovs-vsctl add-port br0 eth0 # or whatever your interface is called

sudo ifconfig br0 10.1.10.223 netmask 255.255.255.0
sudo route add default gw 10.1.10.1
sudo ifconfig eth0 0

Double check here and make sure the IP is working. Then let's commit this to the interfaces file and reboot to ensure it sticks. Below is the entire interfaces file for the dell node; note that it replaces what is there and includes the VLAN info (which is not in an OVS bridge, since we will let Neutron handle that through the ML2 plugin).

# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

auto eth0
allow-br0 eth0
iface eth0 inet manual
ovs_bridge br0
ovs_type OVSPort
pre-up ifconfig $IFACE up
post-down ifconfig $IFACE down
address 0.0.0.0

auto br0
allow-ovs br0
iface br0 inet static
address 10.1.10.223
netmask 255.255.255.0
gateway 10.1.10.1
dns-nameservers 10.1.10.1 75.75.75.75
ovs_type OVSBridge
ovs_ports eth0
pre-up ifconfig $IFACE up
post-down ifconfig $IFACE down

auto eth0.5
iface eth0.5 inet static
vlan-raw-device eth0
address 10.5.0.4
netmask 255.255.255.0

auto eth0.6
iface eth0.6 inet static
vlan-raw-device eth0
address 10.6.0.4
netmask 255.255.255.0

Now let's make sure everything works by crossing our fingers and rebooting.  Especially with Ubuntu you'll likely see a bit of a delay at boot, and then the interfaces should come up.  Log back in and let's see what the final product looks like.

stack@dell:~$ ifconfig
br0 Link encap:Ethernet HWaddr 00:22:19:e2:e5:de
inet addr:10.1.10.223 Bcast:10.1.10.255 Mask:255.255.255.0
inet6 addr: 2601:0:9182:b000:f4ac:61c9:ef87:d6ae/64 Scope:Global
inet6 addr: fe80::4006:22ff:fee7:c09b/64 Scope:Link
inet6 addr: 2601:0:9182:b000:222:19ff:fee2:e5de/64 Scope:Global
UP BROADCAST RUNNING MTU:1500 Metric:1
RX packets:362029 errors:0 dropped:60192 overruns:0 frame:0
TX packets:32782 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:98079943 (98.0 MB) TX bytes:19888165 (19.8 MB)

eth0 Link encap:Ethernet HWaddr 00:22:19:e2:e5:de
inet6 addr: fe80::222:19ff:fee2:e5de/64 Scope:Link
inet6 addr: 2601:0:9182:b000:4ac:2aa5:9f09:e7a6/64 Scope:Global
inet6 addr: 2601:0:9182:b000:222:19ff:fee2:e5de/64 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:744185 errors:0 dropped:0 overruns:0 frame:0
TX packets:351772 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:421597281 (421.5 MB) TX bytes:72956467 (72.9 MB)
Interrupt:17

eth0.5 Link encap:Ethernet HWaddr 00:22:19:e2:e5:de
inet addr:10.5.0.4 Bcast:10.5.0.255 Mask:255.255.255.0
inet6 addr: fe80::222:19ff:fee2:e5de/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:226093 errors:0 dropped:0 overruns:0 frame:0
TX packets:279632 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:305497247 (305.4 MB) TX bytes:48168955 (48.1 MB)


eth0.6 Link encap:Ethernet HWaddr 00:22:19:e2:e5:de
inet addr:10.6.0.4 Bcast:10.6.0.255 Mask:255.255.255.0
inet6 addr: fe80::222:19ff:fee2:e5de/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:257 errors:0 dropped:0 overruns:0 frame:0
TX packets:292 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:28742 (28.7 KB) TX bytes:37008 (37.0 KB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:69856 errors:0 dropped:0 overruns:0 frame:0
TX packets:69856 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6204825 (6.2 MB) TX bytes:6204825 (6.2 MB)

stack@dell:~$ sudo ovs-vsctl show
722d1b65-7e9a-4cdc-a8b2-8a71448a507d
    Bridge "br0"
        Port "eth0"
            Interface "eth0"
        Port "br0"
            Interface "br0"
                type: internal
    ovs_version: "2.0.2"

Make sure you modify /etc/hosts

for dell:

127.0.0.1 localhost
10.1.10.223 dell
10.5.0.4 compute1
10.1.10.165 stinkpad
10.5.0.2 controller network database compute0

for stinkpad:

127.0.0.1 localhost
10.1.10.165 stinkpad
10.5.0.2 controller database network compute0
10.1.10.223 dell
10.5.0.4 compute1

At this point you can install OpenStack or DevStack.

As a reminder: my public network is 10.1.10.0/24, and I want to use .10-.60 for my floating IPs.
10.5.0.0/24 is my management network.
10.6.0.0/24 is my tunnel network.

Here is my /etc/neutron/l3_agent.ini

[DEFAULT]
interface_driver = neutron.agent.linux.interface.OVSInterfaceDriver
use_namespaces = True
external_network_bridge = br0
router_delete_namespaces = True

/etc/neutron/plugins/ml2/ml2_conf.ini

[ml2]
type_drivers = flat,gre
tenant_network_types = gre
mechanism_drivers = openvswitch

[ml2_type_flat]
flat_networks = external

[ml2_type_vlan]

[ml2_type_gre]
tunnel_id_ranges = 1:1000

[ml2_type_vxlan]

[securitygroup]
enable_security_group = True
enable_ipset = True
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

[ovs]
local_ip = 10.6.0.2
enable_tunneling = True
bridge_mappings = external:br0

[agent]
tunnel_types = gre

We use GRE to create tunnels over the 10.6.0.0/24 network between the nodes so that instances within a tenant network can talk to each other.  Note that local_ip in the [ovs] section must be the node's own tunnel address: 10.6.0.2 on the stinkpad and 10.6.0.4 on the dell.
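
For reference, once Neutron is running, the external network and floating IP range can be created with something like the following (a sketch using the Juno CLI; the names ext-net and ext-subnet are my choice, and the flat network label must match the flat_networks/bridge_mappings values above):

. admin-openrc.sh
neutron net-create ext-net --router:external True \
  --provider:network_type flat --provider:physical_network external
neutron subnet-create ext-net 10.1.10.0/24 --name ext-subnet \
  --gateway 10.1.10.1 --disable-dhcp \
  --allocation-pool start=10.1.10.10,end=10.1.10.60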

If you want to try migrating VMs on KVM, don't forget to set up SSH keys for the nova account (you'll need to set its shell to /bin/bash and actually create and exchange keys for passwordless auth between the nodes).
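
A rough sketch of what that looks like with the stock Ubuntu packages (where nova's home directory is /var/lib/nova); run as root on each node:

usermod -s /bin/bash nova
su - nova -c 'mkdir -p ~/.ssh && ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa'
# copy each node's /var/lib/nova/.ssh/id_rsa.pub into the other node's
# /var/lib/nova/.ssh/authorized_keys, then test with: su - nova -c 'ssh compute1 hostname'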

Hope this helps someone.  If you're on a single machine, you can create VLANs freely because the traffic never leaves the box.  Once you have switch VLANs down, this is actually pretty simple compared to trying to fake a bridge and do masquerading to get it to work.

Categories: Cloud, OpenStack

Using a Swift backend for Glance on Ubuntu OpenStack

March 20, 2015

This took a bit for me to figure out with Juno so I thought I’d share if you’re interested.

If you use DevStack, this happens pretty naturally.  I like to understand things a bit more fully and have been playing around with two old laptops running Ubuntu 14.04 and OpenStack (one is a controller plus compute/network, the other is compute/network).

I only have one node running Glance, but both run Swift.  Obviously your Swift should be up and running; OpenStack's Ubuntu install guide covers this pretty well.  If you want to set up something a bit fancier, you can also use the Ubuntu SecurityTeam TestingOpenStack guide (it's a little dated but still good).

Once you have both Swift and Glance up, edit /etc/glance/glance-api.conf

[DEFAULT]

default_store = swift # I don't think this is needed but doesn't hurt

[keystone_authtoken]
auth_uri = http://controller:5000/v2.0
identity_uri = http://controller:35357
admin_tenant_name = service
admin_user = glance
admin_password = glance_pass
revocation_cache_time = 10

[glance_store]
stores = glance.store.swift.Store
#filesystem_store_datadir = /var/lib/glance/images/
swift_store_auth_version = 2
swift_store_auth_address = http://controller:35357/v2.0/
swift_store_user = service:glance
swift_store_key = glance_pass
swift_store_create_container_on_put = True
swift_store_large_object_size = 5120
swift_store_large_object_chunk_size = 200
swift_enable_snet = False

Notice that swift_store_auth_address uses port 35357, the Keystone admin endpoint. Also note that we commented out filesystem_store_datadir.

Also, you'll want to put the glance Keystone account and its tenant in swift_store_user, with the password for that Keystone glance account in swift_store_key.
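
If you're not sure what those values are, you can confirm them with the Juno-era keystone CLI using the admin credentials (this assumes the service tenant and glance user shown in the config above):

. admin-openrc.sh
keystone tenant-list | grep service
keystone user-list | grep glance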

The biggest thing I missed was the stores = glance.store.swift.Store field.  It’s new and not mentioned anywhere I could find except the commented config file.

Go ahead and modify the /etc/glance/glance-registry.conf

[keystone_authtoken]
auth_uri = http://controller:5000/v2.0
identity_uri = http://controller:35357
admin_tenant_name = service
admin_user = glance
admin_password = glance_pass

and then modify the /etc/glance/glance-cache.conf (I don’t think this is needed but just in case)

swift_store_auth_version = 2
swift_store_auth_address = http://controller:35357/v2.0/
swift_store_user = service:glance
swift_store_key = glance_pass
swift_store_container = glance
swift_store_create_container_on_put = True
swift_store_large_object_size = 5120
swift_store_large_object_chunk_size = 200
swift_enable_snet = False

The glance container should get created automatically since we set swift_store_create_container_on_put.

sudo service glance-api restart
sudo service glance-registry restart

Go ahead and grab an image and upload it to Glance

mkdir -p /tmp/images

. ~/admin-openrc.sh #source the admin creds

wget -P /tmp/images https://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img
glance image-create --name ubuntu-trusty-x86_64 --disk-format=qcow2 --container-format=bare --file /tmp/images/trusty-server-cloudimg-amd64-disk1.img --progress

glance image-list

We should see the image returned

stack@dell:~$ . admin-openrc.sh
stack@dell:~$ glance image-list
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| ID | Name | Disk Format | Container Format | Size | Status |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| db3d6070-0acb-4053-8c9d-277eb090496e | cirros-0.3.3-x86_64 | qcow2 | bare | 13200896 | active |
| 1e7b430c-4aee-43d7-a96e-8a6cab394959 | ubuntu-trusty-x86_64 | qcow2 | bare | 256639488 | active |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
stack@dell:~$

Let's make sure it actually went into Swift, though.

Source the glance user's credentials, or just export the username/password/tenant, and then look in Swift

stack@dell:~$ export OS_USERNAME="glance"
stack@dell:~$ export OS_PASSWORD="glance_pass"
stack@dell:~$ export OS_TENANT_NAME="service"
stack@dell:~$ swift list
glance
stack@dell:~$ swift list glance
1e7b430c-4aee-43d7-a96e-8a6cab394959

And that's it, you've now configured Swift as your Glance backend.  I'd encourage you to look into using Ceph, but I'm holding out until I get another crappy laptop to make three nodes (I only have two). Donations are always welcome!

Categories: Cloud, OpenStack

SteamOS on Lenovo T510 with Nvidia Optimus

February 13, 2015

SteamOS is still in beta, but releases keep coming out and I found the concept interesting.  However, you need a PC with a UEFI BIOS and a newer Nvidia or ATI card (although I think Intel may work).  SteamOS is basically meant to run a home theater PC in your living room and/or be a gaming console with a controller (an Xbox controller can work here if you have the right model/driver).  I was more curious than anything, but it wasn't easy to get this running on the T510.  This is just to share what it took and to bookmark a place for me to come back to (since I have other purposes for this computer, namely OpenStack testing with Mirantis).

The Lenovo T510 I used has an Nvidia NVS 3100M card and an i5 processor.  Lenovo, and many other manufacturers, did not add UEFI until 2012 (meaning it is in the T520 but not the T510).  Furthermore, the Nvidia Linux driver dropped support for many older cards (like the 3100M) after the 340.x driver version (I think it's on 343 right now?).  Steam builds the driver in its Debian-based installer, and if dmesg shows NVIDIA it assumes you want the new driver.  The issue is that the automation breaks because the installer prompts about an unsupported driver, so the automated install will always fail.  In any case, here's how you do this.

WARNING: This wipes out all data on the drive!!!

  1. Get the ISO version of the installer; you can build it yourself or get a precompiled copy.  It's just easier to make a DVD here (USB boot is based on UEFI and you have to fake the CDROM being the USB…trust me, just burn a DVD for this).
    1. Get the ISO by going to the Steam forum and checking for a new sticky on a release, then burn a DVD – http://steamcommunity.com/groups/steamuniverse/discussions/1/
  2. Boot up the laptop from the DVD and let it install (you can select expert).  Midway through the core install it will error with a message about retrying 5 times.  Just keep clicking back until it gives you an option to continue, putting you back on a menu screen.  At this point, jump to tty2 by pressing CTRL-ALT-F2 and run the following commands:
    1. chroot /target /bin/bash
    2. apt-get install -f
      1. If this triggers the Nvidia installer, skip the next step
    3. apt-get install nvidia-*
      1. When the installer asks whether to install for an unsupported card, say yes; two more installer prompts will ask questions, select the obvious choices
  3. Hit CTRL-ALT-F5 (I think) to go back to the installer and select install core components (or the selection it should already be on). It will warn of a dirty install, say it’s ok and wait. It should complete the installation (maybe give a warning but it’s fine)
  4. It should boot into a GNOME desktop where you need to enable the network; it will update Steam and reboot.
  5. It will boot to the GRUB menu and should back up the system partition (select this if it doesn't do it automatically). Upon the next reboot, hit ESC after the Lenovo screen (be quick) to get the GRUB menu to show.  Select the recovery mode option and it should boot to a command line.
  6. At boot, log in with desktop/desktop; we will be following most of this guide, but we need networking enabled first and it does not come up properly in recovery mode.
    1. login if you’re not already with desktop/desktop
      1. You may need to set the password using the "passwd" command after login (make it: desktop)
      2. Change the display manager to GDM3 by running the following
        1. sudo dpkg-reconfigure lightdm
          1. select gdm3
        2. sudo passwd steam
          1. set it to “steam”
        3. sudo reboot
    2. Hit ESC after the Lenovo splash screen; keep trying until you see a legacy SteamOS graphical backsplash, and the GRUB menu should appear after a few seconds
      1. Select the normal boot option this time and let it boot into the gnome desktop
      2. Once you're in (you may have to select STEAMOS at login), hit CTRL-ALT-F2 for the command line
      3. Logon with desktop/desktop and proceed to step C
    3. Install the build tools and kernel headers, then grab the legacy Nvidia driver
      1. sudo apt-get install build-essential linux-headers-$(uname -r)
      2. wget http://us.download.nvidia.com/XFree86/Linux-x86_64/340.76/NVIDIA-Linux-x86_64-340.76.run
      3. sudo chmod +x NVIDIA-Linux-x86_64-340.76.run
      4. sudo apt-get --purge remove xserver-xorg-video-nouveau nvidia-kernel-common nvidia-kernel-dkms nvidia-glx nvidia-smi
      5. sudo apt-get remove --purge nvidia-*
      6. sudo nano /etc/modprobe.d/disable-nouveau.conf
        1. # Disable nouveau
          blacklist nouveau
          options nouveau modeset=0
        2. add the lines above and save this new file (CTRL-X then Y then ENTER)
      7. sudo dpkg-reconfigure lightdm
        1. select lightdm
      8. sudo reboot
    4. Hit ESC after the Lenovo splash screen (we’re going back to recovery) and select the recovery option in Grub (should be on bash already)
      1. cd /home/desktop (I think, might be steamos)
      2. ls -la (see if the NVIDIA file is there, if not, find it)
      3. sudo /etc/init.d/lightdm stop
      4. sudo /etc/init.d/gdm3 stop
      5. sudo ./NVIDIA-Linux-x86_64-340.76.run
        1. ACCEPT EULA, say YES to DKMS, YES to 32bit compat, YES to Xorg config and click OK
      6. sudo reboot
    5. Boot to normal mode and wait (you may see just a big cursor and be waiting for over 10 minutes).  I had dinner, came back, and I was up and running.

Hope this works for some of you who want to test it!

This problem has been raised, so hopefully it will be addressed in the final release:

https://github.com/ValveSoftware/SteamOS/issues/163

http://paste.ubuntu.com/7972356/

PS – I noticed a few other interesting places discussing this which I haven't tried:

1) http://steamcommunity.com/groups/steamuniverse/discussions/1/648814395823405268/

2) GitHub for non-UEFI boot (no legacy nvidia yet) – https://github.com/directhex/steamos-installer

Categories: Hardware, Linux

Lightbulb moment with Docker

February 5, 2015

I've heard the word, played with Docker 101, and was still left wondering why this technology is important.  After all, we have VMs we can run applications on, and they obviously support a lot more apps than Docker does.  Something drew me back, though, to actually spin up my own machine with a real Docker Engine and get my hands dirty.

Just a recap to explain the difference between containers and VMs.  VMs are full guest OSes running their own libraries, executables, etc.  They should be a pretty familiar concept to most people at this point.  You can use orchestration and provisioning to stand up the OS, hypervisor, guest OS, and applications.  Normally, many of us on the Tech Ops side of the house are used to doing it the old-fashioned way; we have our shortcuts, but we're still touching or tweaking a few things.  Those of us on larger deployments lean much more towards automation, but it's often overkill for smaller deployments.  Let's place ourselves in the automated camp and say we have templates for the OS, and we can provision on bare metal and even push and configure applications.

Containers, on the other hand, sit on top of an engine (Docker, in this case).  You have a base OS and then Docker.  Docker isn't a hypervisor; it doesn't virtually map hardware the way a hypervisor does for a guest OS.  It takes applications (and a base image) and lets them share resources (mainly libraries and binaries) with some or all of the other containers, or stay isolated from them.  The biggest thing to keep in mind here is persistence.  When Docker updates a container, it doesn't apply a patch or update, it rebuilds the entire container.  Sure, you can modify things inside a running container, but that isn't Docker, that's you doing it, and when you update or rebuild you'll lose those changes.  What you ought to do is use the Dockerfile to add the applications and components you need.
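
As a minimal sketch of that idea (the base image, package, and tag names here are just examples, not anything from a particular project):

cat > Dockerfile <<'EOF'
# base image plus the one application this container exists to run
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y apache2
EXPOSE 80
CMD ["apache2ctl", "-D", "FOREGROUND"]
EOF
docker build -t mywebapp .
docker run -d -p 8080:80 mywebapp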

The first thought I had was: how the heck are you supposed to use a database then?  You can't really afford to wipe that data.  The answer (and pardon the loose terminology) is mapping a volume on the host into the container; if the host can see it, a container can use it (quick example below).  Hope that makes sense.  Keeping this in mind, I still struggled to find why Docker matters.
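
For example, a database container with its data directory kept on the host (just a sketch; the image name, host path, and password are placeholders):

docker run -d --name db \
  -v /srv/mysql-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=changeme \
  mysql:5.6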

In a past life, I spent over a year on a development team (I had little business being there) but techops didn’t like me because I kept thinking like a developer.  So before DevOps was coined, I was a TechOps person planted into Development and it was great.  I learned a lot by managing deployments and watching the process of SVN and code push.

Now, developers usually check in code they develop on their own machines.  Code is supposed to be merged and then run in a test environment (if one exists).  Then it's deployed to production (I'm simplifying this).  Most modern Agile development methods involve streamlining code deployment (so developer 1 isn't waiting for developer 2 to finish before they can work).  That code should also be complete, and automated testing should be used.

Coming from the ops side, I know the big problem: it's the environment.  While some shops can afford multiple copies of prod, those copies always lack things.  Let's say we isolate DEV and copy it from PROD.  It's stale the second you're done, often it's not used, there are differences in testing, and eventually it's flushed down the toilet and rebuilt.  Automation can fix this; cloning environments and data is great.  Technology can't fix people issues, though.

With this in mind, Docker clicked for me.  The issue with most environments is the tweaks, or the shadow IT that adds things after a deployment.  Every time I heard the word "done" I would ask again and usually find out the person meant 90-99.9% done, which is not done; only 100% is done.  Docker solves this because you deploy on the same engine, and if you changed something, you're caught.  It's rebuilt EVERY TIME.  This is a GOOD thing.  It means that dev, QA, preprod and prod are the same, and the infrastructure doesn't get stale because there isn't much to manage.

Developers like this because there is no change to the infrastructure; ops should like it because it takes them out of the blame game.  I think most will not like giving up control, but if you think hard about it, they aren't; there is nothing left to control.  You can back up the container data and have that persist, but just as easily clone and restore it if needed.

I have a lot further to go, I'm just a beginner, but I thought I'd share how I'm finding uses for Docker, and I really do see it as the future (no clue when that will come).  BTW, Docker runs on VMware, generic Linux, AWS, Google Cloud and many more.  If the engine is the same, the apps on top of it can be used on whatever platform you have.  Think of the savings in managing all those guest OSes.  What about the licensing and support?!?

Granted, not everything runs on Docker, but everyday more does.  If I were a vendor, I’d be integrating it to utilize Docker before your competitor does.

PS – I didn't even edit this thing, so please pardon the grammar or run-ons.

Categories: DevOps, Docker

I’ve got problems but 99.999 (five nines) storage isn’t one of them

October 7, 2014

I have recently been in front of a few customers discussing various designs for application and desktop virtualization.  Inevitably, at some point, we discuss storage.  When it comes to storage I often pause and read the room, because most people I know on the VAR and customer side have their favorites, and have what I would refer to as a Dallas Cowboys team (I'm an Eagles fan; if you are a Dallas fan, just reverse the teams, it'll work).

I've architected (is that a real word?) large deployments involving multiple datacenters, high availability and disaster recovery. My focus isn't on picking the single best technology and gluing things together; it's about what works (and hopefully, what works well).  Storage can be a very big issue with VDI: traditional SAN-based storage was not designed for desktop workloads, and we've been spoiled by the fast disks and low latency of the drives humming under our wrists as we type.  Moving these workloads to the data center doesn't always work, and when you add in the latency of a server reaching out to a separate SAN, it compounds the problem.

The traditional SAN isn't usually the best fit for heavy desktops and applications; however, adding flash technology to the mix often deals with the IOPS issue, and latency can be minimized.  Is flash necessary?  Nope.  I've had designs involving 15K SAS drives local to blades work very well.  Citrix's stream-to-memory, overflow-to-disk approach can perform even better with 10K or 7K drives.  However, I often don't get to position that solution, which brings me back to my first point…everyone has favorites.

I can take almost any storage and find a solution.  Even a traditional SAN: if I can use memory to cache, I can make it work.  Local disk? Easy.  Flash appliances? They are great!  But there is one thing I keep hearing about that I don't need: the storage providing high availability, or five nines.  There is a simple reason I don't need five nines, and I cringe when I hear others cite it and lean back.

Your application doesn’t solely rely on storage to be available!

How will five nines prevent downtime when your hypervisor crashes or profile corruption occurs?  What about a failed backup on SQL that just eats up disk space?  What should we do?

We need to embrace failure and assume things fail.  It's so much cheaper than having the hardware give you a warm fuzzy feeling.  When that business app fails, the business doesn't care whether it was storage or a cleaning person tripping over a server cord (I hope that isn't even possible in most of your environments!).  They see IT as the failure, not storage.

I wish I could take credit for this thought process, but Netflix has pretty much perfected it.  If you haven't heard of the Chaos Monkey, you should read up on it – http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html .

Spend enough time in IT and you'll realize that chaos always wins, and you burn out quickly if you're fighting it.  However, returning to my original point, the design and architecture can embrace this too.  When we talk about desktops, many argue persistent versus non-persistent.  Persistent means you keep your desktop; non-persistent means you can roam (which usually means some flavor of roaming profiles).  I'm a big advocate of non-persistent.  Your storage or server fails, you get logged off, you log back in and you're right where you were (or very close to it).  If the application is database-driven and supports mirroring, you can survive storage failures, if set up correctly.

Going back to storage, this means two of whatever I have.  Two local drives, two appliances, two SANs.  I'll take two 95%-uptime appliances over a single 99.999% appliance any time.  I'd rather save costs with single controllers than try to make a single point of failure not fail (because your application never has a single point of failure, it has multiple points of failure).

I'm not arguing that five nines doesn't have a place somewhere.  If you can't use non-persistent, it might be for you.  However, I'd argue that virtualizing your applications and desktops is not a good move if you need persistence anyway.  Just my two cents; feel free to comment if you agree, disagree or think I'm full of it, I'm always open to suggestions!

PS – This is a first draft to publish; I'm sure there are some typos and run-on sentences in there.

Categories: Citrix, microsoft, vmware

InfoBlox and Citrix Issues [RESOLVED]

May 15, 2014

I have heard a lot about Infoblox issues with Citrix, and I had the chance to meet some of the Infoblox team today for lunch and a meeting.  My first question, and that of Kevin Dralle, whom I work with, was about the apparent incompatibility of Infoblox and Citrix, especially with PVS.  Please comment if you think this doesn't work or has issues.

 

Some of the issues have been described elsewhere (I know Jarian Gibson has written and tweeted a few things on this also):

http://discussions.citrix.com/topic/307967-dhcp-issues-with-pxe-boot-and-win7-os-streamed/

http://discussions.citrix.com/topic/301193-provisioned-desktops-with-infoblox/

With Infoblox there is a CTX article, but it's a bit light on details:

http://support.citrix.com/article/CTX200036

So what is going on with Infoblox?  Anytime we have had a customer running Infoblox with Citrix, we cringe or opt for something like a dual-NIC, isolated PVS VLAN (using Microsoft DHCP).  In any case, here is what happens.

Infoblox assigns the device a UID based on the MAC, but also on some of the device's characteristics.  When we boot off PVS, we bring up the bootstrap (bin) file, which acts as the OS at the time of PXE boot.  The MAC is static, but after the bin file pulls the image over TFTP and brings up a Windows OS, the UID changes, so Infoblox assumes you need another IP address.  Obviously there are use cases for this, but for PVS it's an issue because you end up with two IP addresses.

One fix has been to use reservations but this defeats the whole purpose of using an appliance or solution to manage this all.  Furthermore, when or if you get into automation and orchestration, you’ve got one more component to worry about when increasing the scope.

You do need to be on the 6.6 or higher release for this option, but it is worth it if you have this issue, or are an Infoblox shop and want to roll out PVS without resorting to something uncommon in deployments (using BDM has a lot less collateral out there than PXE boot does).

Below are the two areas from the Grid and Member layers where you can set this (courtesy of InfoBlox!).

[Screenshots: the Grid and Member DHCP property screens where this is set]

Categories: Citrix

Citrix SynergyTV – SYN119 – How Atlanta Public Schools delivers virtual desktops to 50,000 students #citrixsynergy

Categories: Citrix, XenDesktop