--- mdomsch has changed the topic to: FudCon: Next Scheduled Topic: MirrorManager 18:00 UTC Aug 15 http://domsch.com/linux/fedora/mirrormanager.odp
can I get some comments here on people's preference: SIP vs. just IRC?
mpdehaan, I would prefer sip.
* dbewley says no sip for him
* mpdehaan votes non-SIP
can we record sessions on the SIP? if not, +1 for IRC.. that way we will have a log :)
lmacken, I don't think so yet
ok, for today then, no SIP, unless you want to heckle me in the background
sip:infrastructure@fedoraproject.org is active, but I'll do the presentation in IRC
everyone should first download the slide deck from /topic
so, brief introduction
I'm Matt Domsch. I'm on the Fedora Project Board, and I'm the self-appointed Fedora Mirror Wrangler
I also happen to work for Dell, but please keep your Dell hardware questions till the end. :-)
About 9 months ago, I realized that no one was intentionally coordinating the efforts of our ~200 mirror server administrators with any regularity
we had several ad-hoc processes, lots of manual mucking about with text files in CVS as mirrors came and went
and generally, we weren't making as good a use of all the volunteer effort of our mirror admins as we could be
which resulted in higher workloads for some mirrors, zero workload for others, and a slower download experience for our end users
I hate mucking with text files when a program can do it for me, so I wrote MirrorManager
Go to Slide 2
We have three main audiences: end users looking to get content quickly
the mirror server administrators
and people in the Fedora Infrastructure project who try to coordinate between those two groups
Oh, feel free to ask questions as we go along
Slide 3
What kind of experience do we want to offer to our end users? That's the driving question behind all of this.
End users should be able to get our bits, fast.
Fast in network terms generally means that if there's a mirror "close" to you network-wise, chances are it's the fastest server from which you should pull
we've built in a few ways by which we determine "closeness", e.g. GeoIP country lookups, network IP address block lookups, etc.
We also don't want to serve content from a "stale" mirror, one which isn't up to date with the content on the master servers
else those security updates may be delayed in getting pushed out
We want it to work seamlessly with YUM, so the user types "yum upgrade" and it "just works" for them
and, we want to offer a plain web page view of who has what, so if you want to manually browse, you can.
questions?
Slide 4
I mentioned we've got ~200 mirror servers globally. That's a lot of volunteer hardware, bandwidth, and sysadmin time - we want to use that efficiently.
Sysadmins have various reasons for offering to mirror Fedora.
Many have a large pool of "nearby" users that they want to serve: universities, companies such as Dell, etc.
Some make their mirror servers "public" so anyone can download from them; some are "private", available only within the organization.
MM knows the difference
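[Illustrative aside: the "yum upgrade just works" flow above starts with the client fetching a mirror list over HTTP before it downloads any packages. A minimal sketch of that request in Python 2 follows; the exact URL and query parameter names are assumptions for illustration, not necessarily the production interface.]

    # Illustrative only: fetch a mirror list roughly the way a yum client would.
    # The URL and parameter names here are assumptions, not the exact interface.
    import urllib

    params = urllib.urlencode({
        "repo": "fedora-7",   # repository / release (assumed value)
        "arch": "x86_64",     # client architecture
    })
    url = "http://mirrors.fedoraproject.org/mirrorlist?" + params

    # The response is assumed to be a plain-text list of mirror base URLs.
    for line in urllib.urlopen(url).readlines():
        line = line.strip()
        if line:
            print line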
We don't make mirror admins set up new services, so if they're only serving HTTP, that's fine; or FTP, that's fine; or even rsync, that's fine.
We just need to know the paths by which users can get to the data
and because they know their own hardware and their own user bases, mirror admins are free to exclude any content they want
so for example, 82 mirrors carry rawhide, while 94 carry Fedora 7
and only 62 carry F8test1
They can also exclude any architectures they want
there are 94 carrying F7 i386, 91 carrying x86_64, and 67 carrying PPC
and they can change that daily if they wish.
so, we need some way to know what content each has
Boston University, for example, doesn't carry ISO images
again, this has to be fine - it's entirely a volunteer effort
* mdomsch makes note to mmcgrath - need to think about that for the redirector
everyone still with me?
Slide 5
So, these are the basic questions that the Fedora Mirror Wrangler deals with:
who has what
where are they located
how can an end user reach that data
and where are the end users coming from, so we can tell them what the closest mirror server is
on release days (much more so before F7 - for F7 we did a great job of handling the load), the primary fedoraproject.org web servers couldn't handle the load spike
the website team did a great job of making fast, small, pretty, cacheable web pages for the F7 launch
but just in case that wasn't enough, we also started mirroring out the static Fedora main web page (fp.o) to the mirrors
so we could redirect people to the global mirrors if we had to.
We didn't need to use that, but it's there now for the future when we get slashdotted, dugg, etc. all at once.
ok, that's the background, any questions or comments so far?
maybe OT, is there any thought of using bittorrent in yum to lighten load on mirrors for updates?
dbewley, not OT, it's a fair question
<|DrJef|> dbewley, that comes up... a lot..
bittorrent does not scale to small packages
short answer is no, not BT
BT is great for ISOs and constant directory trees
less so for changing directory trees
we have looked into metalinks
which have the potential to split up a download across multiple servers "on the fly"
but it's not yet implemented. "Just a simple matter of code" though, if you're interested :-)
thanks. :) i just find the idea of bittorrent and p2p romantic i think
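[Illustrative aside: to make slide 5's bookkeeping concrete before the architecture discussion - "who has what, where are they, how can users reach the data" - here is a minimal sketch of the information tracked for a single mirror. The field names and layout are illustrative assumptions, not MirrorManager's actual schema.]

    # Illustrative per-mirror record -- not the real MirrorManager schema.
    example_mirror = {
        "name":      "mirror.example.edu",       # hypothetical host
        "country":   "US",                       # used for GeoIP country matching
        "netblocks": ["192.0.2.0/24"],           # address ranges the admin claims as "near"
        "public":    True,                       # public mirror vs. organization-only
        "urls": {                                # whatever protocols the admin already runs
            "http":  "http://mirror.example.edu/fedora/linux/",
            "rsync": "rsync://mirror.example.edu/fedora/",
        },
        # Content the admin chooses to carry; anything can be excluded,
        # e.g. rawhide, a particular release, an architecture, or ISO images.
        "carries": {
            "releases": ["7", "rawhide"],
            "arches":   ["i386", "x86_64"],
            "isos":     False,
        },
        "up_to_date": True,                      # filled in by the crawler or report_mirror
    }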
ok, Slide 6
the grand vision
pretty much everything I've said so far, focused on the best end user experience possible
using only Free/Open Source software
Slide 7
we'll be on this slide for a while now
very high level diagram of who the players are, and how they're separated
heavy lines are bandwidth-intensive links
light arrows are small requests (e.g. requesting the yum mirrorlist) and/or light metadata updates
First, let's look at the arrow from bottom left going upwards to the right
Fedora, via Red Hat's infrastructure, has several "master" mirror servers onto which Red Hat I/S stages the content
these masters are in Raleigh, Tampa Bay, and Phoenix
in addition, we recently added a new master, at iBiblio in Raleigh, connected to Internet2
these servers are access controlled, so only official mirrors can use them
and all the various official mirror servers pull from one of these 4 multiple times a day
rsync is your friend
now let's look at the right side, the gray boxes
here we have the MirrorManager application itself, split into a 3-tier architecture (only 2 tiers shown)
the MM application runs on several servers owned by the Fedora Infrastructure team
and all instances of the app talk to a single instance of a database, also on a server owned by the team
and there are web pages served at mirrors.fedoraproject.org/publiclist and the like, handled by the application layer
To populate the database with state information about each mirror (who has what right now), there is a web crawler application
that crawls each public mirror and records info about what is available from each into the database
we also have a lightweight application, 'report_mirror', which can be run on each mirror server itself; it does a self-crawl and reports the data back to the database over XML-RPC-over-HTTPS
report_mirror is especially useful for "private" mirrors (those which can't be accessed by the crawler because they're inside a firewall)
and for the days leading up to a release, when the bits are getting copied out to places, but the permission bits keep the files from being world-readable
so we've set the stage
the mirrors have the data they want to carry
the database knows what data each has
now, we're ready to take client requests, because we can now answer those requests sanely
The clients make a request, generally using YUM
/etc/yum.repos.d/*.repo files generally have a 'mirrorlist=...' line in them
which is used to retrieve a list of mirrors for a particular repository (release/version/architecture)
now, we could return all 92 mirrors to that query, but generally, we want to only hand back mirrors that are 'local' to the user if we can
first, we do a lookup based on the client IP, to see if there's a "netblock" that one of the mirrors claims as being near to them
if so, we hand back those "near" mirrors.
UUNet is a good example here. they have three mirror servers, and they're a major backbone network provider
so all of their customers will get directed to download from one of their three mirrors rather than someone else on another network
this is cheaper for them to offer, as they don't have to pay for transit via another network provider
Likewise, if the user's org has a private mirror - e.g. Dell has a private mirror - we can use their IP address to automatically redirect users to pull from the site-local private mirror
this netblock matching is a feature unique to Fedora as far as I can tell
but I think it's one of the "killer" features of Fedora's mirror system
now, if there's no match by IP netblock
we fall back to looking up the IP in GeoIP, to get their country
we look up that country in our list of mirrors by country, and return those mirrors from the same country
with a caveat - if there are <3 mirrors in a country, we return the global list (all 92 or so entries)
this protects against a country's only mirror being down briefly
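[Illustrative aside: the selection order just described - netblock match first, then GeoIP country, then the global list, with the fewer-than-three-mirrors fallback - compresses to something like the sketch below. The data layout and helper names are assumptions; as discussed later in the talk, the production path answers from pre-made tables rather than recomputing per request.]

    # Illustrative sketch of the mirror-selection policy, not MirrorManager's real code.
    import socket
    import struct

    def ip_in_netblock(ip, netblock):
        """True if dotted-quad `ip` falls inside `netblock` ('a.b.c.d/prefix')."""
        net, prefix = netblock.split("/")
        mask = (0xffffffffL << (32 - int(prefix))) & 0xffffffffL
        ip_n = struct.unpack("!I", socket.inet_aton(ip))[0]
        net_n = struct.unpack("!I", socket.inet_aton(net))[0]
        return (ip_n & mask) == (net_n & mask)

    def select_mirrors(client_ip, client_country, mirrors):
        # 1. Netblock match: a mirror claiming the client's address range wins
        #    (the UUNet and site-local private mirror cases).
        near = [m for m in mirrors
                if any(ip_in_netblock(client_ip, nb) for nb in m["netblocks"])]
        if near:
            return near
        # 2. GeoIP country match, but only if the country has at least 3 mirrors.
        same_country = [m for m in mirrors if m["country"] == client_country]
        if len(same_country) >= 3:
            return same_country
        # 3. Otherwise, the global list (all ~92 mirrors for the repository).
        return mirrors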
for this reason, we actively solicit mirrors in countries where we don't have much of a presence, such as a China mirror coming online soon
in addition, there's opportunity for a step in between "country" and "global", which is "continent"
however, that requires an extension to the python-GeoIP code which no one has gotten around to writing yet.
if there are any C->python binding experts nearby that want to help, it should be easy to do; the country->continent mapping is already available in libGeoIP, it's just not exported via python-GeoIP
once we return the best mirrorlist we can to the yum mirrorlist request, we're done; yum then directs its requests at one of the mirrors on the list we just returned
questions?
* ricky hears you, but no mic at the moment.
ok
ok, slide 8
I wonder how this system will tie in with what I heard about presto yesterday.
ricky, that I don't know - i missed that presentation
Ah, OK. Depending on the policy of making deltarpms, it could save a lot of space, maybe.
I wrote MM in TurboGears, a slick web application development framework
anyone here written an app in TurboGears?
* ricky has been playing with some TG apps recently.
it's python + SQLObject + Kid + a database
it was my first big python project, and it handles the work amazingly well
however, it's not fast
and when you've got >3 million yum instances hitting your servers every day looking to see if there are updates
TG wasn't fast enough to handle all of that online, even spread over several servers
my goal was to answer any client yum mirrorlist request in <0.3s
but, apache + mod_python *is* that fast
so, I trimmed a little tiny corner of the application out into a separate stand-alone app that's run by apache
this runs on each of the load-balanced apache instances, and returns the mirrorlists based on the lookup criteria above, the answers to which are all pre-made
* lmacken hasn't noticed any speed/memory issues with bodhi.. after weeks of uptime, it still sits at about ~50mb memory usage
my TG instances were >100MB each
Wow.
python 2.4 isn't good about freeing up memory once you've allocated it
2.5 i'm told is better in this regard
s/2.4//
I'm sure if we combed through the MM TG code and threw some 'del's in there, we could probably make a big difference
lmacken, go for it!
when I get a free second, I'll check it out.. toshio and I were talking about it in gobby the other day actually
so anyhow, by splitting this piece out into a separate mod_python app, it's really really fast, because it's basically a hash table lookup
by netblock, and if that fails, by country, and if that fails, return the global list
it doesn't have to calculate anything
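[Illustrative aside: a minimal sketch of that split, assuming a pickle file as the hand-off between the two tiers (the hand-off format isn't specified in the talk). The database-backed application periodically dumps pre-made answers; the lightweight frontend then serves each request with nothing but dictionary lookups.]

    # Illustrative sketch of the "pre-made answers" split; file name, layout, and
    # hand-off format are assumptions.
    import cPickle

    # --- offline, in the database-backed application: precompute and dump ---
    cache = {
        "by_netblock": {"192.0.2.0/24": ["http://mirror.example.edu/fedora/"]},
        "by_country":  {"US": ["http://us1.example.org/fedora/",
                               "http://us2.example.org/fedora/",
                               "http://us3.example.org/fedora/"]},
        "global":      ["http://mirror.example.net/fedora/"],
    }
    cPickle.dump(cache, open("/var/lib/mirrormanager/cache.pkl", "wb"))

    # --- online, in the lightweight frontend: nothing but lookups ---
    cache = cPickle.load(open("/var/lib/mirrormanager/cache.pkl", "rb"))

    def mirrorlist(netblock_key, country):
        """Answer a request from pre-made data; no computation, no database access.
        Matching the client IP to `netblock_key` is done as in the earlier sketch."""
        return (cache["by_netblock"].get(netblock_key)
                or cache["by_country"].get(country)
                or cache["global"])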
I mentioned the web crawler. This is another custom app.
I tried using urlgrabber, but couldn't get it to do what I needed
which was, not HTTP GETs, but HTTP HEADs by the thousands
so, I looked at how urlgrabber sat on top of urllib, and basically wrote my own very simple version of the same
which makes sense
it also does FTP NLST entries
urlgrabber is about grabbing urls - not HEADs :)
if the server doesn't serve HTTP
this walks all the possible directories (those on the master mirror servers), looking for that same content on each mirror, and updates the DB appropriately
that walk can take a while, especially if HTTP keepalives are disabled
but about every 6-12 hours, every server is crawled
in addition, servers can run report_mirror after every rsync run to update the DB as to their state
it walks the directory tree locally, which is pretty fast (2-3 minutes on a lightly loaded server)
so, that's the basic architecture, and I've got 5 minutes left
so, questions, comments, tomatoes
have people felt their downloads have been better?
Well, for the "beautify web pages" part, we should have mostly working templates (unless the original ones have changed recently) :)
yeah, I need to pull in those changes
F7 has been very good at updating since I installed it.
yeah, last slide
it just works
MrHappy, that's my motto and goal: "it just works"
if nothing else, thank you all for attending
if you wouldn't mind, if you haven't made a comment yet, just say something in the channel so I can get some idea of headcount
and have a good afternoon/evening
I was listening :)
something
As a side note - it'd be nice to save a transcript of talks on the wiki somewhere, for people that couldn't attend.
ricky, agreed, that's why I put my bot in here today
mdomsch: How can I get a copy of the mirrormanager code with the crawler enabled in the TG app?
abadger1999, see http://hosted.fedoraproject.org/projects/mirrormanager
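[Illustrative aside: the crawler's key trick above is issuing HTTP HEAD requests rather than GETs, so it can check whether a path exists on a mirror without transferring the file. A minimal sketch with Python 2's httplib follows; the host and path are hypothetical, and the real crawler also handles FTP listings, keepalives, and walking whole directory trees.]

    # Illustrative only: check one path on one mirror with an HTTP HEAD request.
    import httplib

    def path_exists(host, path):
        """Return True if `path` appears to exist on `host`, judged by a HEAD request."""
        conn = httplib.HTTPConnection(host)
        try:
            conn.request("HEAD", path)
            return conn.getresponse().status == 200
        finally:
            conn.close()

    if __name__ == "__main__":
        print path_exists("mirror.example.edu",
                          "/fedora/linux/releases/7/Fedora/i386/os/repodata/repomd.xml")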