Kathy Allen
yXQp California Street
Berkeley, CA 94703
(510) XXX-XXXX
kallen at groknaut dot com

Skills Overview
---------------
OS: primarily Linux (RedHat, CentOS, Debian), a generalist in MacOS, Solaris, Windows
Scripting/Languages: bash shell, ruby, perl, some SQL
Networking: TCP/IP, BGP, DNS, DHCP, load balancers (HAProxy, Perlbal, BigIP). Juniper, Force10, and Cisco switches, routers, and firewalls.
Hardware: x86 hardware (Dell, HP, Rackables, Supermicro), serial console (Lantronix, Cyclades), Netapp Filers, Sun (Netra/SunFire/Enterprise)
Applications: Chef, CFengine, Apache, Nginx, Linux Kickstart, RPM and Solaris package building, Memcached, Mogilefs, NFS, BIND, OpenLDAP, MIT Kerberos, HA-DRBD, RAID, openssh, openssl, iptables, version control (GIT, SVN, CVS), proxy cache servers (squid and varnish), email (SMTP, POP, IMAP4, MTA: postfix, sendmail, qmail), monitoring (Nagios, Ganglia, Cacti, SNMP, Cricket), databases (MySQL, PostgreSQL)
Other: AWS, EC2, S3

Work Experience
---------------
(Apr 11 - present) Operations Engineer, Okta Inc., San Francisco, CA

(Jan 11 - Apr 11) Sr UNIX Operations Engineer, SAY Media, San Francisco, CA
* Six Apart and Videoegg merged to become SAY Media in November 2010.

(Nov 06 - Dec 10) Sr UNIX Operations Engineer, Six Apart, San Francisco, CA
* Deploy and maintain IP transit links with Tier 1 and 2 providers. Tune BGP to keep traffic on links balanced, and at, but not over, commit levels.
* Replaced Cisco PIX site-to-site VPN functionality with OpenVPN and iptables running on low-power, high-availability linux servers. This provided us simple and reliable encrypted tunnels, without needing to renew support contracts.
* Responsible for a range of network administration tasks including creating network maps and documentation, firmware upgrades on everything from edge routers to core switches, monitoring of devices and traffic, maintaining layer 3 ACLs, automating tasks where possible.
* As the lead sys admin for LJ, I integrated what was a wholly separate set of systems, networks, and practices into the fold of 6a systems and standards, standards I helped design. This effort included many pages of documentation, the setting forth of hostnaming and DNS conventions, integration and improvement of automation with CFengine and Kickstart, rebuilding the hodgepodge of hardware on which LJ ran onto standard sets of hardware and practices, the improvement of monitoring including aspects which had yet to be monitored, and application performance tuning and balancing. 6a sold LJ in December of 2007. The points below are aspects of what I did for all of 6a's properties, not exclusively LJ.
* Tuned apache on various application systems for best performance and use of system resources. Performed capacity tests on a variety of application layers (e.g. apache, perlbal) at peak and non-peak times to determine proper allotment of physical servers needed. Documented method of capacity testing for future use and tuning.
* Set forth policies on access and privilege control, doing so collaboratively, with buy-in from management and both sides of the aisle: ops and engineering.
* Refactored the CFengine setup, untangling legacy configurations and practices into a structure that is much more manageable and intuitive.
* Primary lead on datacenter move of all of 6a's systems with little to no downtime. During the move, improved many aspects of systems and practices, including standard host builds enabling the team to deploy tens and hundreds of hosts quickly.
* Wrote a Disaster Recovery Planning document for 6a, covering various disaster scenarios and mitigation. As a result of researching this document I helped identify places in the environment where improvements need to be made for resilience and quick response in case of disaster.
* Designed and deployed DNS configuration automation, using IPPLAN, perl, and bash trigger scripts, with plenty of error checking and reporting.

(Jun 04 - Jun 06) Sr UNIX Systems Engineer, Shopping.com (an eBay company), Brisbane, CA
* As part of the Sarbanes-Oxley compliance effort, I designed and implemented a central authentication and authorization system using MIT Kerberos and OpenLDAP for both UNIX systems (Solaris and Linux) and in-house applications. Access for a given individual could be finely controlled, including giving such access to one host, or a group of hosts. Given that user management was previously done by hand using local files, central auth provided greater efficiency, far fewer errors, and an audit trail when managing user accounts and their access and privilege level. Also advised engineering staff on designing their applications to use this centralized auth system. Wrote various scripts to simplify LDAP and Kerberos data management, as well as automate batch conversions of user data into the new system.
* Designed and implemented a remote network OS install system using kickstart, pxelinux, and serial console. This greatly increased the ease and output rate of installing the OS on hosts: the department stopped installing the OS by hand from CD-ROM and instead installed it remotely over the network using standardized, revision-controlled install profiles.
* Administered and documented the central configuration tool, CFengine. Planned and executed a controlled cutover of about 1300 clients from old legacy CFengine servers to new ones.
* Instrumental as operations lead for the Merchant Account Center, a new application using apache and tomcat. Advised engineering during the design phase of the new application regarding production environment requirements, including ease of management, scalability, redundancy, user account management and privilege level, security, monitoring, etc.
Continued to advise through QA cycles, and executed rollout of the new application. This close pairing of operations lead and application engineering was regarded as exemplary as it led to the most successful new application rollout the company had yet seen. Future projects coming from engineering followed this model I helped provide. During subsequent projects, I continued working closely with engineering staff helping them to produce applications that are more manageable in general in QA and production environments. Advised on the organization of application configuration files and better parameterization. Trained staff on methods of package management (RPM and Solaris packages), ultimately improving application deployment in both QA and production environments.

(Sep 00 - Jun 04) Sr UNIX System Administrator, CollabNet, Brisbane, CA
* Designed and maintained systems in a 24x7 production operations infrastructure with other team members. Used tools such as Jumpstart, Kickstart, and autoinst to automate the deployment and maintenance of a hosted application environment which serves online collaborative software development and project management.
* Implemented, maintained, and documented nameservers, mailservers, system monitoring, remote console, Kickstart and Jumpstart servers, RPM and Solaris package build systems.
* Advanced production, development, QA, and corporate network and server infrastructures from undocumented, unreliable systems with high administration overhead to robust, reliable, repeatable systems. Examples include, but are not limited to:
  a) Deployed clustered Netapp Filers for customer production data; designed and implemented disaster recovery with two cross-country datacenters, using a second Netapp cluster mirroring over one terabyte of customer data in near real-time.
As a result, we achieved near-zero downtime (99.99+% uptime) for production systems using this design.
  b) Separated dev and QA environments into isolated subnets, provided NFS shared home directory access for systems all under either Kickstart or Jumpstart control.
  c) To prevent dev/QA abuse of the corporate mailserver (abuse which severely degraded corporate mail availability), I composed a spec based on dev/QA requirements for an internal mailserver, and deployed said server, greatly improving corporate mail access.
* Script writing in Bourne shell and perl. Purposes include, but are not limited to: log rotation, archiving, and expiration; automated nightly RPM and Solaris package build system; dynamic DNS cgi tool for QA's use; locating CVS repository files of particular size, obtaining CVS ownership, last modification time, and parsing results as requested; simple user account management both on a system and application level.
* Implemented a documentation repository for operations, maintained docs, and always encouraged other operations personnel to participate.
* Assessed datacenters for hosting production systems. Helped plan and execute datacenter installations of network components, NAS, and servers. In June 2002, my team planned and executed a massive datacenter move from San Francisco to San Jose given only two weeks' notice. Expanded same datacenter in Spring 2004, planned rack layout for maximum server density to achieve the best cost benefit, and deployed third Netapp Filer cluster.
* Researched and presented proposals for a new data storage purchase for our rapidly growing customer data. Attention given to planning the actual data migration, giving minimal downtime for customers. Also researched proposal for similar data storage system for QA and development environments.
* Much system and application troubleshooting and bug hunting, including, but not limited to, the following: CVS usage causing excessive server load, swapping; finding and fixing MySQL database corruption; diagnosed spindle contention on backend datastore, migrated data across more spindles.

(Feb 00 - Aug 00) UNIX System Administrator, Critical Path Inc., San Francisco
Responsible for the installation, configuration, and management of 1000+ Solaris and FreeBSD systems in 6 datacenters around the globe. Included performance tuning of SMTP, POP and IMAP servers in an environment hosting more than 18 million accounts. Duties included, but were not limited to, jumpstarting hosts, maintaining DNS, and remote console servers. Very fast-paced 24x7 production environment, with firefighting and oncall duties. Paid particular attention to issues of scalability, automation, performance, and redundancy.

(Aug 99 - Feb 00) Network Operations Center Specialist, Critical Path Inc., San Francisco
Monitored all production systems including network, email, calendar, and directory services in an environment that hosts over 18 million user accounts. Identified and corrected problems before they became customer-affecting. Performed troubleshooting on all UNIX hosts (running Solaris and FreeBSD), and escalated to senior ops staff when necessary. Duties included, but were not limited to: updating DNS, restarting services, moving redundant clustered hosts in and out of service on the NFS backend data store, installing RPMs, writing shell scripts. Maintained documentation of production and monitoring environment. Trained incoming NOC specialists.

(Dec 98 - Feb 99) UNIX System Administrator, Taos Mountain, Santa Clara, CA
Consultant for Critical Path, a company that provided outsourced email services. Monitored system processes and log files to locate potential errors and service problems. Solved service problems as they arose or escalated to higher level admins, providing them details.
Solved DNS problems with new customers and advised them on the configuration of their DNS records. Updated system operations staff on system performance and any service outages or problems. Tested various services including POP, HTTP, and SMTP during system maintenance windows. Systems ran Solaris 2.6 and FreeBSD in a clustered configuration in a very large networked environment.

(Feb 97 - Dec 98) Systems Specialist, Genentech Inc., S. San Francisco, CA, through a contract from Interim Technologies
Provided Windows and Mac desktop support, and primary email technical support for 3000+ users.

Personal skills
---------------
- Excellent analytical and problem solving skills
- Superb communication and organization skills
- Proven ability and flexibility to learn new tasks quickly
- Highly effective as a team player and self-motivated working alone

Education
---------
- Yale University, New Haven, CT, B.A. Film Studies, GPA: 3.4 (Aug 90 - May 94)

References available upon request