[yam] python2.7 version and run yam correlate on subsets of data

Wed Jul 18 10:43:21 CEST 2018

Dear Blaž,

> Just one more question. If I run yam from the command line on a folder
> with multiple years of data inside, specifying only one year in the
> parameter file, it will process just that year, right? And if I then
> change to another year in the param file, will the process add to the
> already existing database?

Yes, that should work as expected. A better option might be to use the 
"startdate" and "enddate" options in the config of the correlate command: 
define a base configuration for correlate and override "startdate" and 
"enddate" for each call to yam correlate. This could be done either with 
separate configurations derived from the base one (via "based_on") or by 
simply changing the parameter file between calls.
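
As a rough sketch of the "based_on" variant: the small Python script below 
prints one derived correlate configuration per year, ready to be pasted into 
the correlate section of the parameter file. The key names ("1_2016", ...), 
the base key "1", the YYYY-MM-DD date format, and the choice of Dec 31 as 
"enddate" are assumptions made for illustration, so please check them against 
your own parameter file. The printed command lines likewise assume that yam 
correlate takes the configuration key as its argument.

```python
import json


def yearly_correlate_configs(base_key, years):
    """Build one correlate configuration per year, derived from base_key.

    Each entry only overrides "startdate" and "enddate"; everything else
    is meant to be inherited from the base configuration via "based_on".
    """
    configs = {}
    for year in years:
        key = f"{base_key}_{year}"  # hypothetical naming scheme
        configs[key] = {
            "based_on": base_key,
            "startdate": f"{year}-01-01",  # assumed date format
            "enddate": f"{year}-12-31",  # assumed end of the yearly window
        }
    return configs


# "1" is a placeholder for the key of your base correlate configuration.
configs = yearly_correlate_configs("1", [2016, 2017, 2018])

# JSON fragment to paste into the correlate section of the parameter file.
print(json.dumps(configs, indent=4))

# One call per year (assumed command-line form).
for key in configs:
    print(f"yam correlate {key}")
```

The fragment is printed rather than written into the parameter file directly, 
so the file itself stays under your control.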

I hope you don't mind if I post my answer on the yam mailing list. 
Others might be interested.

Good luck,
Tom

On 17.07.2018 20:18, Blaž Vičič wrote:
> Dear Tom,
> Too bad, thanks! I was not really planning to use multiple nodes, just one node with multiple processors, so multiprocessing shouldn't be an issue. I'll try to run it on my PC in the meantime, and ask our IT if they can make py3 work on my account.
> 
> Just one more question. If I run yam from the command line on a folder with multiple years of data inside, specifying only one year in the parameter file, it will process just that year, right? And if I then change to another year in the param file, will the process add to the already existing database?
> 
> Thanks,
> Cheers, Blaž
> 
>> On 17 Jul 2018, at 17:41, Tom Eulenfeld <tom.eulenfeld at uni-jena.de> wrote:
>>
>> Dear Blaž,
>>
>> Sorry, I do not have a Python 2.7 version of the module.
>>
>> I attempted to create a 2.7 version with the 3to2 conversion package. Unfortunately, I was not able to get yam running on Python 2.7 within a reasonable time.
>>
>> Anyway, I do not know if the software is suitable for a cluster, because it uses the multiprocessing module for parallelization. Or do you want to run yam on each node on a subset of the data?
>>
>> I do not have much experience with cluster-based processing. Maybe it would be possible to install Python 3 in a conda environment in your user directory?
>>
>> Best regards!
>> Tom
>>
>>
>>
>>> On 17.07.2018 14:59, Blaž Vičič wrote:
>>> Dear Tom.
>>> I wanted to use your package YAM on our cluster, since I am dealing with quite a big dataset. Sadly, we still don't use python3 on the cluster, so I guess this is an issue with your module.
>>> Do you, by any chance, have a py27 version of the module?
>>> Thanks
>>> Blaz