<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body><div> </div>
<div>On Wed, Apr 22, 2015, at 11:32 PM, Ciro Iriarte wrote:<br></div>
<blockquote type="cite"><p dir="ltr">Hi Bron, it makes sense from that perspective, although it seems to imply a management nightmare. Do you use any management/automation (webscale if you want) framework?<br></p></blockquote><div>Less than you might imagine :)<br></div>
<div> </div>
<div>We have a single file (production.dat) which contains all the layout information mapping from machines to slot numbers, and slot numbers to disks, for example:<br></div>
<div> </div>
<div>i30 30 t15 0 1000 e 10.202.80.1<br></div>
<div> </div>
<div>This says that slots sloti30t01 through sloti30t15 are on server number 30, that they have a zero-sized meta drive (all meta is on the SSD) and a 1000 Mb data drive, that they run the ext4 filesystem, and that they use IP addresses 10.202.80.1 through 10.202.80.15.<br></div>
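<div> </div>
<div>As a rough illustration of how such a line expands into slots (a Python sketch, not our actual tooling; the field names are guesses based on the description above):<br></div>
<pre>
```python
import ipaddress

def expand_slots(line):
    """Expand one machine line into a list of (slot name, IP address) pairs."""
    # Field names are assumptions taken from the prose description:
    # prefix, server number, slot count, meta size, data size, filesystem, first IP.
    prefix, server, count, meta_size, data_size, fs, first_ip = line.split()
    n_slots = int(count.lstrip("t"))  # "t15" is read as 15 slots
    base = ipaddress.IPv4Address(first_ip)
    return [
        (f"slot{prefix}t{i:02d}", str(base + (i - 1)))
        for i in range(1, n_slots + 1)
    ]

slots = expand_slots("i30 30 t15 0 1000 e 10.202.80.1")
print(slots[0])
print(slots[-1])
```
</pre>
<div>The first and last entries correspond to sloti30t01 on 10.202.80.1 and sloti30t15 on 10.202.80.15.<br></div>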
<div> </div>
<div>And then a store based on that is:<br></div>
<div> </div>
<div>store23 n 0 90 sloti30t01 sloti15t03 slotti5t02 slotsi2d2t01<br></div>
<div> </div>
<div>That's where my <a href="mailto:brong@fastmail.fm">brong@fastmail.fm</a> user lives - it has replicas on imap15 (New York), timap5 (Iceland) and simap2 (Amsterdam). The 'n' says that the master should live in New York. The '0' is a bit bogus actually, as we'll see in a sec. The '90' says that it has a target maximum disk usage of 90%.<br></div>
<div> </div>
<div>store254 n future 0 sloti30t15 sloti29t15 slotti1t06 slotsi1d2t40<br></div>
<div> </div>
<div>This is a testing store. Only one real user lives here, and that's my personal non-work account. All the other users are test users. The "future" says that it's running on the future branch of Cyrus, which is where we try out experimental code. This means that all the commands which locate the binaries for tools will look in the matching paths, like this:<br></div>
<div> </div>
<div>[brong@imap30 ~]$ cyr store254<br></div>
<div>Store: store254<br></div>
<div>Master: sloti30t15 (imap30) 10.202.80.15<br></div>
<div>Primary: sloti30t15 (imap30) 10.202.80.15<br></div>
<div>This: sloti30t15 (imap30) 10.202.80.15<br></div>
<div>Other: sloti29t15 (imap29) 10.202.79.15<br></div>
<div>Other: slotsi1d2t40 (simap1) 10.206.51.80<br></div>
<div>Other: slotti1t06 (timap1) 10.205.161.6<br></div>
<div> </div>
<div>sudo -u cyrus /usr/cyrus-future/bin/cyr_dbtool -C /etc/cyrus/imapd-sloti30t15.conf /mnt/ssd30/sloti30t15/store254/conf/mailboxes.db twoskip<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/reconstruct -C /etc/cyrus/imapd-sloti30t15.conf<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/dav_reconstruct -C /etc/cyrus/imapd-sloti30t15.conf<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/cyr_synclog -C /etc/cyrus/imapd-sloti30t15.conf -v<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/ctl_conversationsdb -C /etc/cyrus/imapd-sloti30t15.conf<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/squatter -C /etc/cyrus/imapd-sloti30t15.conf -v -i<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/sync_client -C /etc/cyrus/imapd-sloti30t15.conf -n sloti29t15 -v<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/sync_client -C /etc/cyrus/imapd-sloti30t15.conf -n slotsi1d2t40 -v<br></div>
<div>sudo -u cyrus /usr/cyrus-future/bin/sync_client -C /etc/cyrus/imapd-sloti30t15.conf -n slotti1t06 -v<br></div>
<div> </div>
<div>So I can even run 'cyr <a href="mailto:brong@fastmail.fm">brong@fastmail.fm</a>' and it will give me the correct commands to run for my user.<br></div>
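<div> </div>
<div>To give an idea of how those command lines can be derived from the slot name and branch, here is a minimal Python sketch. It is not the real 'cyr' tool: the function names are made up, and the default binary directory for stores without a branch is an assumption (only the "future" path appears above).<br></div>
<pre>
```python
def tool_command(slot, tool, branch=None, args=""):
    """Build the sudo command line for running one Cyrus tool against a slot."""
    # "/usr/cyrus-future/bin" is taken from the output above; the
    # "/usr/cyrus/bin" default for stores without a branch is an assumption.
    bindir = f"/usr/cyrus-{branch}/bin" if branch else "/usr/cyrus/bin"
    conf = f"/etc/cyrus/imapd-{slot}.conf"
    cmd = f"sudo -u cyrus {bindir}/{tool} -C {conf}"
    return f"{cmd} {args}" if args else cmd

def sync_commands(master, replicas, branch=None):
    """One sync_client command per replica slot, run from the master slot."""
    return [
        tool_command(master, "sync_client", branch, f"-n {replica} -v")
        for replica in replicas
    ]

commands = sync_commands(
    "sloti30t15", ["sloti29t15", "slotsi1d2t40", "slotti1t06"], branch="future"
)
for line in commands:
    print(line)
```
</pre>
<div>With the store254 slots this prints the same three sync_client lines as shown in the output above.<br></div>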
<div> </div>
<div>If it wasn't heavily automated, it would be a pain. Configuration files are built from Perl Template-Toolkit using Makefiles and data from the production.dat file.<br></div>
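<div> </div>
<div>The real templates are Template-Toolkit files, but the idea can be sketched in Python with string.Template as a stand-in. The template content here is an assumption: it guesses that the per-slot config points configdirectory at the directory holding mailboxes.db in the paths shown earlier.<br></div>
<pre>
```python
from string import Template

# Hypothetical one-line template; a real imapd.conf has many more settings.
CONF_TEMPLATE = Template(
    "# imapd-$slot.conf (generated, do not edit)\n"
    "configdirectory: /mnt/ssd$server/$slot/$store/conf\n"
)

def render_conf(slot, server, store):
    """Fill in the template for one slot/store pair."""
    return CONF_TEMPLATE.substitute(slot=slot, server=server, store=store)

conf_text = render_conf("sloti30t15", "30", "store254")
print(conf_text)
```
</pre>
<div>A Makefile rule would then write one such file per slot whenever production.dat changes.<br></div>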
<div> </div>
<div>What we don't have so much yet is automated user moves or disk layout building, though it's semi-automated. I have a script which can be told "make config for 5 new stores" and it will find the least used machines, within the constraints we have for placing slots, and pick out empty slots on them.<br></div>
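<div> </div>
<div>The selection step could look roughly like the following Python sketch. The machine data is invented, and the only "constraint" modelled is spreading new stores across machines one slot at a time; the real placement rules are not shown here.<br></div>
<pre>
```python
def pick_slots(machines, n_stores):
    """Pick empty slots for new stores, visiting the least used machines first.

    machines maps a machine name to {"used_pct": float, "empty_slots": [...]}.
    """
    free = {name: list(info["empty_slots"]) for name, info in machines.items()}
    order = sorted(machines, key=lambda name: machines[name]["used_pct"])
    picked = []
    # Take one slot per machine per pass so new stores are spread out.
    while n_stores > len(picked) and any(free.values()):
        for name in order:
            if free[name] and n_stores > len(picked):
                picked.append((name, free[name].pop(0)))
    return picked

# Hypothetical usage figures and empty slots.
machines = {
    "imap29": {"used_pct": 71.0, "empty_slots": ["sloti29t14"]},
    "imap30": {"used_pct": 42.0, "empty_slots": ["sloti30t13", "sloti30t14"]},
    "imap15": {"used_pct": 88.0, "empty_slots": []},
}
picked = pick_slots(machines, 3)
print(picked)
```
</pre>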
<div> </div>
<div>For moving users, 'MultiMove.pl' knows about disk usage on backends and can pick random users on busy backends to move. Our MoveServer.pl script is very smart: it does what Ken at CMU and now Ellie have done in the upstream branch with Sync-based-XFER, but externally. It runs sync_client 3 times, plus squatter, plus cyr_expire for archiving, and it locks out users in the DB, fiddles caches, etc. The upshot is that the user gets about a 3 second pause and their connections drop, then they keep on working as if nothing happened.<br></div>
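<div> </div>
<div>The "pick random users on busy backends" part can be sketched in Python as below. This is not MultiMove.pl: the data layout, the user names and the number of users picked per backend are all made up for illustration, and "busy" is taken to mean disk usage above the store's target percentage.<br></div>
<pre>
```python
import random

def pick_users_to_move(backends, per_backend=2, rng=random):
    """Return {backend: [users]} with random users from over-target backends."""
    moves = {}
    for name, info in backends.items():
        if info["used_pct"] > info["target_pct"]:
            count = min(per_backend, len(info["users"]))
            moves[name] = rng.sample(info["users"], count)
    return moves

# Hypothetical stores and users.
backends = {
    "store23": {
        "used_pct": 93.5,
        "target_pct": 90,
        "users": ["a@example.com", "b@example.com", "c@example.com"],
    },
    "store254": {"used_pct": 40.0, "target_pct": 90, "users": ["d@example.com"]},
}
moves = pick_users_to_move(backends, per_backend=2, rng=random.Random(0))
print(moves)
```
</pre>
<div>Each chosen user would then be handed to the move script, which does the actual sync and cutover.<br></div>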
<div> </div>
<div>Bron.<br></div>
<div> </div>
<div id="sig567075"><div class="signature">--<br></div>
<div class="signature">Bron Gondwana<br></div>
<div class="signature">brong@fastmail.fm<br></div>
<div class="signature"> </div>
</div>
<div> </div>
</body>
</html>