[yam] multi-core problem
Tom Eulenfeld
tom.eulenfeld at uni-jena.de
Tue Mar 6 16:49:27 CET 2018
Hi Weijun,
I am also writing to the mailing list. Maybe others face similar
problems in the future.
Yes, the output is not very helpful.
I've seen that you run Python 3.6.1 and I found this bug which might be
related:
https://bugs.python.org/issue28699
Can you try to upgrade your Python installation? I suggest using
Anaconda. This probably will not fix the failure, but it might resolve
the deadlock and give a more meaningful error message.
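
Independent of the Python version, the hang itself can be turned into an
error. The traceback shows the main process waiting forever in the pool
iterator's next(); that method accepts a timeout. Below is a minimal
sketch, not code from yam: the names square and collect and the 30 s limit
are made up for illustration, and the timeout applies to each single
result, so it has to be longer than the slowest legitimate task.

```python
import multiprocessing


def square(x):
    """Stand-in for one correlation task (hypothetical, not from yam)."""
    return x * x


def collect(result_iter, n_tasks, timeout):
    """Pull n_tasks results from a pool iterator without waiting forever.

    result_iter is what pool.imap() or pool.imap_unordered() returns.
    Its next() method accepts a timeout in seconds and raises
    multiprocessing.TimeoutError when no result arrives in time.
    """
    results = []
    for _ in range(n_tasks):
        try:
            results.append(result_iter.next(timeout=timeout))
        except multiprocessing.TimeoutError:
            raise RuntimeError(
                "no result within %s s; a worker may have died" % timeout)
    return results


if __name__ == "__main__":
    tasks = list(range(8))
    with multiprocessing.Pool(2) as pool:
        it = pool.imap_unordered(square, tasks)
        print(sorted(collect(it, len(tasks), timeout=30)))
```

With this pattern a lost task shows up as a RuntimeError after the timeout
instead of a progress bar that never moves.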
Cheers!
Tom
On 06.03.2018 15:07, Weijun Wang wrote:
> Hi, Tom,
>
> I am not sure which lines I should send, so I am copying all of the output. Sorry, it looks like there is still no useful information.
>
> Thanks,
>
> Weijun.
>
> __________________
>
> (obspy) [wwj at t570 yam_test]$ yam-runtests -v
...
>> yam correlate 1 -vvv
> CLI tests passed: 35%|██████████████████████████████████████████████████████▊ | 26/74 [00:20<00:19, 2.43it/s]
> ***CTRL+C here***
>
> CLI tests passed: 36%|████████████████████████████████████████████████████████▉ | 27/74 [03:48<17:41, 22.58s/it]Traceback (most recent call last):
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/pool.py", line 684, in next
> item = self._items.popleft()
> IndexError: pop from an empty deque
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
> File "/home/wwj/anaconda3/envs/obspy/bin/yam-runtests", line 11, in <module>
> load_entry_point('yam', 'console_scripts', 'yam-runtests')()
> File "/home/wwj/old/gits/obspy/yam/yam/tests/__init__.py", line 27, in run
> ret = not runner.run(suite).wasSuccessful()
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/runner.py", line 176, in run
> test(result)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 84, in __call__
> return self.run(*args, **kwds)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 122, in run
> test(result)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 84, in __call__
> return self.run(*args, **kwds)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 122, in run
> test(result)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 84, in __call__
> return self.run(*args, **kwds)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 122, in run
> test(result)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/case.py", line 649, in __call__
> return self.run(*args, **kwds)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/case.py", line 601, in run
> testMethod()
> File "/home/wwj/old/gits/obspy/yam/yam/tests/test_main.py", line 168, in test_cli
> self.out('correlate 1') # takes long
> File "/home/wwj/old/gits/obspy/yam/yam/tests/test_main.py", line 82, in out
> self.script(cmd.split())
> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 388, in run_cmdline
> run(**args)
> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 147, in run
> run2(command, **args)
> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 211, in run2
> yam.commands.start_correlate(io, **args)
> File "/home/wwj/old/gits/obspy/yam/yam/commands.py", line 167, in start_correlate
> total=len(tasks)):
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/tqdm/_tqdm.py", line 959, in __iter__
> for obj in iterable:
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/pool.py", line 688, in next
> self._cond.wait(timeout)
> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/threading.py", line 295, in wait
> waiter.acquire()
> KeyboardInterrupt
>
>
>
>> -----Original Message-----
>> From: "Tom Eulenfeld" <tom.eulenfeld at uni-jena.de>
>> Sent: 2018-03-06 21:00:48 (Tuesday)
>> To: "Weijun Wang" <wjwang at cea-ies.ac.cn>
>> Cc:
>> Subject: Re: [yam] multi-core problem
>>
>> Hi Weijun,
>>
>> good to hear that it is at least working for a single core.
>>
>> Unfortunately, I cannot reproduce your error. I think the child process
>> is dying somehow. Can you please post the last few lines of
>> yam-runtests -v
>>
>> I think I need to add more debug statements in the code to find the bug.
>>
>> Cheers!
>> Tom
>>
>>
>> On 06.03.2018 11:58, Weijun Wang wrote:
>>>
>>> Hi, Tom,
>>>
>>> Sorry, I got your name wrong in my first email.
>>>
>>> The environment I am running is:
>>>
>>> OS: CentOS Linux release 7.4.1708 (Core)
>>> Python: 3.6.1
>>> obspy: 1.1.0 py36_1 conda-forge
>>> obspyh5: 0.3.2 <pip>
>>> yam: 0.3.1-dev
>>>
>>>
>>> Yes, the error messages I posted before came from running the demo notebook (yam_velocity_variations_patcx).
>>> yam-runtests gets stuck at some point, for example:
>>> -----------------------------------
>>> (obspy) [wwj at t570 yam_test]$ yam-runtests
>>> CLI tests passed: 32%|██████████████████████████████████████████████████▌ | 24/74 [00:17<00:38, 1.30it/s]
>>> -----------------------------------
>>> and never continues. When I press Ctrl+C, I get:
>>> -------------------------------------
>>> Traceback (most recent call last):
>>> File "/home/wwj/anaconda3/envs/obspy/bin/yam-runtests", line 11, in <module>
>>> load_entry_point('yam', 'console_scripts', 'yam-runtests')()
>>> File "/home/wwj/old/gits/obspy/yam/yam/tests/__init__.py", line 27, in run
>>> ret = not runner.run(suite).wasSuccessful()
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/runner.py", line 176, in run
>>> test(result)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 84, in __call__
>>> return self.run(*args, **kwds)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 122, in run
>>> test(result)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 84, in __call__
>>> return self.run(*args, **kwds)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 122, in run
>>> test(result)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 84, in __call__
>>> return self.run(*args, **kwds)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/suite.py", line 122, in run
>>> test(result)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/case.py", line 649, in __call__
>>> return self.run(*args, **kwds)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/unittest/case.py", line 601, in run
>>> testMethod()
>>> File "/home/wwj/old/gits/obspy/yam/yam/tests/test_main.py", line 168, in test_cli
>>> self.out('correlate 1') # takes long
>>> File "/home/wwj/old/gits/obspy/yam/yam/tests/test_main.py", line 82, in out
>>> self.script(cmd.split())
>>> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 388, in run_cmdline
>>> run(**args)
>>> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 147, in run
>>> run2(command, **args)
>>> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 211, in run2
>>> yam.commands.start_correlate(io, **args)
>>> File "/home/wwj/old/gits/obspy/yam/yam/commands.py", line 167, in start_correlate
>>> total=len(tasks)):
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/tqdm/_tqdm.py", line 959, in __iter__
>>> for obj in iterable:
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/pool.py", line 688, in next
>>> self._cond.wait(timeout)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/threading.py", line 295, in wait
>>> waiter.acquire()
>>> KeyboardInterrupt
>>> ^CError in atexit._run_exitfuncs:
>>> Traceback (most recent call last):
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/util.py", line 254, in _run_finalizers
>>> finalizer()
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/util.py", line 186, in __call__
>>> res = self._callback(*self._args, **self._kwargs)
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/pool.py", line 535, in _terminate_pool
>>> cls._help_stuff_finish(inqueue, task_handler, len(pool))
>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/multiprocessing/pool.py", line 520, in _help_stuff_finish
>>> inqueue._rlock.acquire()
>>> KeyboardInterrupt
>>> ----------------------------------
>>>
>>> thanks,
>>>
>>> Weijun.
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: "Tom Eulenfeld" <tom.eulenfeld at uni-jena.de>
>>>> Sent: 2018-03-06 18:14:51 (Tuesday)
>>>> To: seistools at listserv.uni-jena.de
>>>> Cc: wjwang at cea-ies.ac.cn
>>>> Subject: Re: [yam] multi-core problem
>>>>
>>>> Hello Weijun,
>>>>
>>>> sorry, your mail somehow got lost in the Mailman instance. I have
>>>> attached it below.
>>>>
>>>> Regarding your problem:
>>>>
>>>> 1. Did you run yam-runtests? Does it show the same error? Which
>>>> operating system are you using?
>>>> 2. Is your installation up to date? Check yam --version. The latest
>>>> version is 0.3.0.
>>>> 3. If you are already on the latest version, can you try out the
>>>> development version of yam? You can install the dev version with
>>>>
>>>> pip install https://github.com/trichter/yam/archive/master.zip
>>>>
>>>> Recently, I reworked how things are written to the HDF5 file. In version
>>>> 0.3.0 and prior versions an extra process was spawned just for writing
>>>> into HDF5 files to circumvent the concurrent writing problem. In the dev
>>>> version, writing is done from the main process, which is simpler and less
>>>> error-prone.
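
The single-writer idea described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not yam's actual code:
compute and write_from_main are hypothetical names, and a plain text file
stands in for the HDF5 file so that the example runs without h5py.

```python
import multiprocessing
import os
import tempfile


def compute(task):
    """Stand-in for one correlation job (hypothetical); runs in a worker."""
    return task, task * task


def write_from_main(pool, tasks, path):
    """Workers only compute; the calling (main) process is the only writer."""
    written = 0
    with open(path, "a") as out:  # plain text file as a stand-in for HDF5
        # Results arrive in the main process as soon as a worker finishes.
        for key, value in pool.imap_unordered(compute, tasks):
            out.write("%d\t%d\n" % (key, value))
            written += 1
    return written


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "corr.txt")
    with multiprocessing.Pool(2) as pool:
        print(write_from_main(pool, range(10), target), "results written")
```

Because only one process ever opens the output file, two processes cannot
compete for the file lock.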
>>>>
>>>> Best,
>>>> Tom
>>>>
>>>>
>>>>
>>>> -------- Forwarded Message --------
>>>>
>>>> Hello, Yawar,
>>>> When I run yam on multiple cores, errors like the following example
>>>> appear frequently. The problem seems to be concurrent writing to the
>>>> HDF5 file in commands.py. I am not familiar with HDF5, so I don't know
>>>> whether the "Multiprocess concurrent write and read" section of
>>>> http://docs.h5py.org/en/latest/swmr.html can help.
>>>> Thanks,
>>>>
>>>> -----------------------------------------
>>>>
>>>> $ yam correlate 1b
>>>>
>>>> --------------------error message--------------------------------
>>>> 20%|████████▌ | 75/366 [02:52<11:08, 2.30s/it]Traceback (most recent call last):
>>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/h5py/_hl/files.py", line 111, in make_fid
>>>> fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
>>>> File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
>>>> File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
>>>> File "h5py/h5f.pyx", line 78, in h5py.h5f.open
>>>> OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
>>>>
>>>> During handling of the above exception, another exception occurred:
>>>>
>>>> Traceback (most recent call last):
>>>> File "/home/wwj/anaconda3/envs/obspy/bin/yam", line 11, in <module>
>>>> load_entry_point('yam', 'console_scripts', 'yam')()
>>>> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 388, in run_cmdline
>>>> run(**args)
>>>> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 147, in run
>>>> run2(command, **args)
>>>> File "/home/wwj/old/gits/obspy/yam/yam/main.py", line 211, in run2
>>>> yam.commands.start_correlate(io, **args)
>>>> File "/home/wwj/old/gits/obspy/yam/yam/commands.py", line 168, in start_correlate
>>>> _write_stream(result)
>>>> File "/home/wwj/old/gits/obspy/yam/yam/commands.py", line 156, in _write_stream
>>>> result[key].write(io[key], 'H5', mode='a')
>>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/obspy/core/stream.py", line 1443, in write
>>>> write_format(self, filename, **kwargs)
>>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/obspyh5.py", line 186, in writeh5
>>>> with h5py.File(fname, mode, libver='latest') as f:
>>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/h5py/_hl/files.py", line 269, in __init__
>>>> fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
>>>> File "/home/wwj/anaconda3/envs/obspy/lib/python3.6/site-packages/h5py/_hl/files.py", line 113, in make_fid
>>>> fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
>>>> File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
>>>> File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
>>>> File "h5py/h5f.pyx", line 98, in h5py.h5f.create
>>>> OSError: Unable to create file (unable to open file: name = 'corr.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
>>>>
>>>>
>>>> --
>>>> Weijun Wang
>>>>
>>>> Institute of Earthquake Forecasting, China Earthquake Administration
>>>> Beijing, China
--
Dr. Tom Eulenfeld
Institute for Geosciences
Friedrich-Schiller-University Jena