IRC Logs for #cmvt Tuesday, 2013-07-16

prologicDanielBaird:  working on BCCVL today?01:18
*** DanielBaird has quit IRC02:08
*** DanielBaird has joined #cmvt02:29
DanielBairdback now.  Um i might have some ccav time today.  I guess i'll pull down the latest data building, make a clean data location, and get that working02:50
DanielBairdfix the 404s etc02:50
prologicI warn you03:04
prologicit now takes nearly an hour to generate summaries03:04
prologicI'm working on trying to speed this up :)03:04
prologicmy suggestion would be to fix any bugs you currently have in the UI03:05
prologicand wait till I've sped up gensummaries :)03:05
prologicor I can tar up my "data" dir03:05
prologicand you can download it from somewhere03:05
prologicit'll be faster currently to download than to generate :)03:05
DanielBairdyeah okay i'll just not worry about the data and 404ing.. instead i can go look at the map layout03:12
prologicthat sounds good03:18
prologicI had a quick chat to AB about where we're at03:19
prologicthe map script side is a bit of an unknown03:19
prologicwant you to have a quick look at this:03:21
prologicI've developed a single-process (much simpler) tool that does intersections of a given raster layer against every geometry in a region collection03:22
prologicthis is the output03:22
prologicnotice the _IBRA_<id>.tif files?03:22
prologicthe <id> is the OBJECTID of the feature (if there is one) otherwise the "id" of the feature (if there is one) otherwise an integer counted up from 0 as I loop through all the region features -- skipping any features with null geometries03:23
prologicthis takes 15s to produce here03:23
prologicwith multiprocessing I expect to cut that down even smaller03:23
prologicso making intersectraster use multiprocessing has yielded a 200% performance improvement03:59
prologicand I'm outputting smaller GeoTIFF files per region geometry03:59
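[Editor's sketch] The `<id>` rule described above (OBJECTID first, then the feature's "id", then a running counter, with null geometries skipped) could look roughly like this. It assumes features are GeoJSON-like dicts of the kind Fiona yields; the function name `feature_ids` and the choice to advance the counter once per kept feature are assumptions, not the actual tool's code.

```python
def feature_ids(features):
    """Yield (region_id, feature) pairs, skipping features with no geometry.

    Assumption: the fallback counter advances once per kept feature, so it
    matches the feature's position among the non-null features.
    """
    counter = 0
    for feature in features:
        if feature.get("geometry") is None:
            continue  # null geometry: nothing to intersect against
        props = feature.get("properties") or {}
        if props.get("OBJECTID") is not None:
            region_id = props["OBJECTID"]
        elif feature.get("id") is not None:
            region_id = feature["id"]
        else:
            region_id = counter
        counter += 1
        yield region_id, feature


# Hypothetical features, only to exercise the three cases plus a null geometry.
example_features = [
    {"id": "7", "properties": {"OBJECTID": 12}, "geometry": {"type": "Point"}},
    {"id": "8", "properties": {}, "geometry": {"type": "Point"}},
    {"properties": {}, "geometry": None},
    {"properties": {}, "geometry": {"type": "Point"}},
]
example_ids = [region_id for region_id, _ in feature_ids(example_features)]
```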
DanielBairdhmm i now have a couple of symlinks, e.g.  webui -> src/webui04:43
DanielBairdoh i see that was just recently.. i hadn't refreshed my bitbucket page since then04:44
prologicI'm a lazy developer :)04:46
prologicI do things for convenience04:46
prologicok not lazy04:46
prologicbut you type:04:46
prologicvim src/ccav/ every time04:46
prologicvim src/webui/js/...04:47
prologicperformance improvements are looking promising04:47
prologichaven't tied it all together yet04:47
prologicbut yeah04:47
DanielBairdthat's okay i was momentarily worried that i'd run something in the wrong dir04:47
prologicdid you read?04:47
DanielBairdyes, looks good04:47
prologichopefully I'll be able to modify and greatly simplify gensummaries04:47
prologicand it should run a whole heap faster04:48
DanielBairdi started a data refresh anyway, so i can give the graph page a last check over.. i haven't seen the states yet04:48
prologicI have 2hrs left of the day04:48
prologiclet's see if we can't improve this04:48
prologicand shove the full datasets in04:48
prologicof the two we're using04:48
prologicit takes an hour on my Mac04:48
prologiciMac that is with Core i704:48
prologicso be warned04:49
prologicnext version of gensummaries should hopefully be at minimum 2x faster04:49
DanielBairdit'd be prettier to have a few years, like three, and five or so vars, and ten or so regionjs04:49
DanielBairdregions * lol too used to typing "js"04:49
prologic10 or so regions?04:49
prologicyou have the full regions already04:49
prologicall of IBRA, LGA, NRM and States04:50
prologicwhat other region collections were we planning on using?04:50
DanielBairdyeah i'm saying if we had to trim down the amount of processing, having the first ten regions in each of those region types would be okay if it let us have more variables & years04:50
DanielBairdoh yeah states, guess i should add another tab.  i'll automate the tab building from a region type list04:51
DanielBairdah i missed your email, full datasets for limited models is even better04:52
prologicI thought so :)04:54
prologic*fingers crossed* I can pull it off in time04:55
prologicbtw DanielBaird04:57
prologicthis whole determining what State another region is in04:57
prologicdo you see this as a general/generic thing?04:57
prologicwhere we want to identify what IBRA, NRM, LGA region is in what State?04:57
prologicor more precisely04:57
prologicidentifying geometries of a region collection that intersects another and creating a simple mapping of such04:58
prologicwell in the current gensummaries04:59
prologicI am doing exactly this anyway04:59
DanielBairdit was a significant part of the wireframe thingy that i've been working off, e.g. without state category, the donut graphs aren't very interesting05:00
prologicintersecting every geometry of every region collection against the States05:00
prologicand spitting out that relationship05:00
prologicwell in that case I'm just going to do this05:01
prologicas it's nice and simple and statically generated05:01
prologican intersectvector <vector1> <vector2> tool05:01
prologicthat will do exactly that and create a mapping05:01
DanielBairdso an overlapping region gets several states attached?05:02
prologicno it only gets 105:02
DanielBairdthe best one :)05:03
prologicafaik (according to the intersection and R-tree indexing) it's the most overlapping geometry05:03
prologicI'm pretty sure that's how the R-tree indexing works by bounding box05:03
DanielBairdcool that's what users will expect, I guess05:03
prologicI could be wrong :)05:03
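[Editor's sketch] A bounding-box index query normally only returns candidates whose boxes intersect the query box; it does not rank them by overlap. Picking "the most overlapping" state therefore needs an explicit comparison of intersection areas. The sketch below uses plain rectangles `(minx, miny, maxx, maxy)` as stand-ins for real polygons, and every name and coordinate in it is hypothetical; it is not the code behind gensummaries or intersectvector.

```python
def overlap_area(a, b):
    """Area shared by two boxes given as (minx, miny, maxx, maxy)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def best_match(region, candidates):
    """Return the name of the candidate with the largest overlap, or None."""
    best_name, best_area = None, 0.0
    for name, box in candidates.items():
        area = overlap_area(region, box)
        if area > best_area:
            best_name, best_area = name, area
    return best_name


# Hypothetical boxes: r1 straddles two states, r2 lies outside both.
states = {"QLD": (0, 0, 10, 10), "NSW": (0, -10, 10, 0)}
regions = {"r1": (2, -1, 6, 5), "r2": (20, 20, 25, 25)}

# A plain dict gives constant-time lookup from region id to state.
mapping = {rid: best_match(box, states) for rid, box in regions.items()}
```

A region that overlaps no state maps to `None`, which is one way null states can appear in the output.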
prologicoh gawd05:28
prologicI'm going to be really lazy here05:28
prologicand instead of rewriting the whole of gensummaries05:28
prologicam just going to just leave it as is05:28
prologicand take advantage of the newly created files from intersectrasters05:28
prologicoh well :)05:28
prologicif it ain't broke don't fix it right :)05:28
DanielBairdyes.. and gensummaries isn't running every day or anything.05:30
prologicyou should see the code05:31
prologicI kinda can't follow it anymore :(05:31
DanielBairdthat's efficiency, that is.. when your code turns into concrete juuust when it's working and doesn't need any more editing05:32
prologicyeah I mean it works05:33
prologicso why change it right?05:33
prologicthe only part I need to speed up is the intersections, clipping and masking05:33
prologicwhich is what intersectraster and intersectrasters (plural) now do05:34
prologicthe latter uses plumbum (nice pythonic OO shell wrapper) to find all bioclim geotiff files and all region shape files and run intersectraster against them05:34
prologicwhich takes 44.7s here05:35
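[Editor's sketch] The "every raster against every region file" step can be illustrated with the standard library alone. This swaps plumbum for `pathlib` and `itertools`, and the directory layout, file patterns and the `intersectraster <raster> <shapefile>` argument order are all assumptions made for illustration. The jobs are only built here, not executed.

```python
from itertools import product
from pathlib import Path


def build_jobs(raster_dir, region_dir):
    """Return one command (as an argument list) per raster/shapefile pair.

    Assumptions: rasters end in .tif, region files end in .shp, and the
    tool takes the raster first and the shapefile second.
    """
    rasters = sorted(Path(raster_dir).rglob("*.tif"))
    shapefiles = sorted(Path(region_dir).rglob("*.shp"))
    return [
        ["intersectraster", str(raster), str(shapefile)]
        for raster, shapefile in product(rasters, shapefiles)
    ]
```

Each argument list could then be handed to `subprocess.run`, or spread over a `multiprocessing.Pool`, since the pairs are independent of one another.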
prologicargg damnit05:38
prologicmaybe I do have to rewrite it a little05:38
DanielBairdit looks like sometimes the state comes out as null.. is that expected?05:39
prologicdoes it?05:45
DanielBairdin my last fab data, there are lots of null states.05:46
prologicit is possible05:46
prologicthere are some regions that are outside of any state boundaries05:46
DanielBairdoh maybe the intersection only returns a state when the region is completely inside a state?05:46
prologicfor example05:46
prologicthere are some IBRA regions05:46
prologicwhere their boundaries extend beyond any state boundaries05:47
prologicand even extend beyond the extents of our models05:47
prologicsuch as Antarctica05:47
prologicand Cook Islands05:47
prologicwe just simply do not have data for such parts of the region's geometry05:47
prologicso I try my best (in code) to clip the arrays themselves05:47
prologicbut nothing I can do about matching the states05:48
prologicwhat would I clip the geometry to?05:48
prologicit's easy when clipping and masking a model05:48
prologicI just clip to the bounds of the model05:48
prologicif the geometry exceeds or tries to index negatively the model05:49
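[Editor's sketch] Clipping a region's pixel window to the bounds of the model array can be done by clamping the indices. This matters because negative indices in Python and NumPy slices wrap around silently instead of raising an error. The function name and the half-open `(start, stop)` window convention are assumptions, not the actual gensummaries code.

```python
def clamp_window(row_start, row_stop, col_start, col_stop, shape):
    """Clamp a half-open pixel window to an array of the given (rows, cols).

    Returns (row_start, row_stop, col_start, col_stop) inside the array,
    or None when the window lies entirely outside it.
    """
    n_rows, n_cols = shape
    r0 = max(0, min(row_start, n_rows))
    r1 = max(0, min(row_stop, n_rows))
    c0 = max(0, min(col_start, n_cols))
    c1 = max(0, min(col_stop, n_cols))
    if r0 >= r1 or c0 >= c1:
        return None  # no overlap with the model at all
    return r0, r1, c0, c1
```

For example, a window starting 5 rows above a 10x20 model and running 30 columns past its right edge is trimmed to the part that actually has data, while a window wholly outside the model yields `None`.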
DanielBairdsorry in convo elsewhere..05:57
DanielBairdback now05:57
DanielBairdit looks like there's only a couple with states set, it might be my data massaging i'll check it through.05:59
prologicI thought there were  more matches than that06:00
prologicI'm about to find out though06:00
prologicI've just written and about to test intersectvector06:00
prologicUsage: intersectvector <vector_file1> <vector_file2> <output_file>06:01
prologicwhich will hopefully create a mapping of vector_file1 geometries to vector_file2 geometries06:01
DanielBairdjust one single region, the first IBRA region, has a state set..all the rest are null06:01
prologicofc this is something more suited to a database join of two vector tables06:02
prologicbut oh well :)06:02
prologicI'll look into that06:02
DanielBairdweird that it's the first one that works, smells like a loop error or something like that06:03
prologiclook here06:04
prologicthe tool I just quickly wrote says there are many more matches than 106:04
prologicin fact I'm really happy with this kind of mapping06:05
prologicnice fast O(1) lookup06:05
prologicI could easily deal with06:06
prologic3 separate mapping files06:06
prologicI think with this final tool06:06
prologicI can rewrite gensummaries to not be such insane code with 7 nested loops06:06
prologicsorry 6 nested loops :)06:07
DanielBaird my IBRA summary for current...06:08
DanielBairdmight just be that my fab data isn't up to date, have you changed it today?  i think i fetched early afternoon06:09
prologicno I haven't06:10
prologicit smells of a bug :)06:10
prologicI'll hunt it down06:10
prologiccheck something for me06:12
prologicyour version of fiona06:12
prologic$ python -c "import fiona; print fiona.__version__"06:12
DanielBairdsry 0.12.106:12
prologicdo a "fab develop"06:13
prologicget your dependencies up-to-date :(06:13
prologicthere was a bug in Fiona < 0.16.1 that caused this06:13
DanielBairdah cool06:13
prologicor rather old code had to reopen the data source every single time I wanted to search it06:13
DanielBairdthis is why i should be doing all the dev in a vm06:13
prologicSean Gillies fixed it06:13
prologicwell doesn't matter so much06:13
prologicyou weren't aware :)06:14
DanielBairdyep that's updated it.  alright i'll kick it off again06:14
prologicalthough I suppose you might have seen something in the commit messages about it :)06:14
DanielBairdactually i'll fetch first06:14
prologicI did talk about it in my commits06:14
prologicI'm outta here shortly06:29
prologicI'll see you in the morning06:29
DanielBairdkk np06:29
prologicI'm going to update the data sources to the full data sets06:29
prologicof the two models we're using06:29
prologicand run it against my Mac overnight06:29
prologicsee how it goes06:29
prologicand time it06:29
DanielBairdi'm hoping to get this data build done before i go home, with any luck i've gotten the display ready for states etc06:31
prologicI'm going to try and hope to speed up the generation tomorrow if that's the only thing I do06:33
prologicit'll be one task off the list06:33
prologichere goes06:34
prologicdoing a full deploy on my local dev vm06:34
prologicwith full datasets for the two models we're using06:34
prologicit'll probably take all night :)06:34
prologicbut at least I'll be able to tar up and copy the data dir to NECTAR06:34
prologicok I'm outta here06:38
prologicI hope the deploy works :)06:38
DanielBairdsee ya06:38
*** DanielBaird has quit IRC08:01
*** DanielBaird has joined #cmvt08:31
*** DanielBaird has quit IRC08:40
*** DanielBaird has joined #cmvt09:06
*** DanielBaird has quit IRC09:11
*** DanielBaird has joined #cmvt10:07
*** DanielBaird has quit IRC10:11
*** DanielBaird has joined #cmvt12:30
*** DanielBaird has quit IRC12:35
*** DanielBaird has joined #cmvt13:31
*** DanielBaird has quit IRC13:36
*** DanielBaird has joined #cmvt13:58
*** DanielBaird has quit IRC14:24
*** DanielBaird has joined #cmvt14:55
*** DanielBaird has quit IRC15:04
*** DanielBaird has joined #cmvt15:30
*** DanielBaird has quit IRC15:36
*** DanielBaird has joined #cmvt16:33
*** DanielBaird has quit IRC16:36
*** DanielBaird has joined #cmvt17:33
*** DanielBaird has quit IRC17:38
*** DanielBaird has joined #cmvt19:52
*** DanielBaird has quit IRC19:56
*** DanielBaird has joined #cmvt20:22
*** DanielBaird has quit IRC20:26
*** DanielBaird has joined #cmvt21:49
*** DanielBaird has quit IRC21:53
*** ChanServ has quit IRC22:43
*** robert_pyke has quit IRC22:43
*** prologic has quit IRC22:43
*** DanielBaird has joined #cmvt22:46
*** robert_pyke has joined #cmvt22:46
*** ChanServ has joined #cmvt22:46
*** prologic has joined #cmvt22:46
prologicDanielBaird:  Good Morning23:18
robert_pykeEvening all ;)23:20
prologicargg fuck23:20
prologicI forgot about disk requirements23:20
prologicshit and bbq23:20
prologicwe really should chart up what the disk requirements really are23:24
prologicbased on no. of models and resolution23:24
prologichopefully 12GB ought to be enough for 2 models :)23:24
*** DanielBaird has joined #cmvt23:26
