Duplicate task in each worker. #3270
Comments
They don't seem to be duplicates as they have different task ids? |
Thanks for your reply. The fact is that the same data (/tmp/__36) was sent to two workers. I think one task didn't send an ack in time, so RabbitMQ sent the task to another worker? |
I have the same issue: I can see in the beat log that the task is only sent once, but it is received twice by the worker, a few seconds apart. I'm using Celery 3.1.23 (Cipater) with Redis 3.2.3. |
Same issue, but the task id is the same @ask @DurandA. Celery 3.1.23, Rabbit 3.6.5

[2016-10-07 12:08:25,875: INFO/MainProcess] Received task: recrepli.tasks.replicate[7c508306-772f-464e-bb7c-103bc1dca784]
[2016-10-07 12:08:26,587: DEBUG/MainProcess] on_sucess() for task recrepli.tasks.replicate[7c508306-772f-464e-bb7c-103bc1dca784] return '0'
[2016-10-07 12:08:27,504: INFO/MainProcess] Received task: recrepli.tasks.replicate[7c508306-772f-464e-bb7c-103bc1dca784]
[2016-10-07 12:08:40,286: DEBUG/MainProcess] on_sucess() for task recrepli.tasks.replicate[7c508306-772f-464e-bb7c-103bc1dca784] return '1' |
@ask I'm also attaching a log where you can see that the task is duplicated and executed in parallel:

[2016-10-07 12:13:25,966: INFO/MainProcess] Received task: recrepli.tasks.replicate[8935556a-f6c5-4e0e-8709-d5958e437e08] |
Hi @ask, please dismiss my assertion in this issue. Carlos |
Hi all. We've found the same issue in several of our projects when we used Redis as the broker, but it seems it's not a broker-related problem. It really looks like (we've checked the logs) any delayed task (ETA used, for example with countdown) can be picked up by more than one worker. We use this snippet of ours to prevent such task behavior:

import datetime

from celery import Task
from celery.utils.log import get_task_logger
from django.conf import settings
from django.core.cache import get_cache

logger = get_task_logger(__name__)


# noinspection PyAbstractClass
class TaskWithLock(Task):
    """
    Base task with a lock to prevent multiple executions of tasks with ETA.
    This happens with multiple workers for tasks with any delay (countdown, ETA).
    You may override the cache backend by setting `CELERY_TASK_LOCK_CACHE` in your Django settings file.
    """
    abstract = True
    lock_expiration = 60 * 60 * 24  # 1 day
    cache = get_cache(getattr(settings, 'CELERY_TASK_LOCK_CACHE', 'default'))

    @property
    def lock_key(self):
        """Unique string for the task, used as the lock key"""
        return '%s_%s' % (self.__class__.__name__, self.request.id)

    def acquire_lock(self):
        """Acquire the lock (atomic cache add)"""
        result = self.cache.add(self.lock_key, True, self.lock_expiration)
        logger.debug('Lock for %s at %s %s', self.request.id, datetime.datetime.now(), 'succeeded' if result else 'failed')
        return result

    def release_lock(self):
        """Release the lock"""
        result = self.cache.delete(self.lock_key)
        logger.debug('Lock release for %s at %s %s', self.request.id, datetime.datetime.now(), 'succeeded' if result else 'failed')
        return result

    def retry(self, *args, **kwargs):
        """Release our lock so the first process can take the current task back into execution on retry"""
        logger.debug('Retry requested %s, %s', args, kwargs)
        self.release_lock()
        return super(TaskWithLock, self).retry(*args, **kwargs)

    def __call__(self, *args, **kwargs):
        """Check for lock existence before executing"""
        if self.acquire_lock():
            logger.debug('Task %s execution with lock started', self.request.id)
            return super(TaskWithLock, self).__call__(*args, **kwargs)
        logger.debug('Task %s skipped due to lock detection', self.request.id)

Feel free to test and point out any errors in this snippet. |
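For illustration, a task could opt into this base class via the base= argument; the task name and body below are assumptions, not part of the snippet above.

from celery import shared_task

# Hypothetical task using the TaskWithLock base class from the snippet above;
# the lock taken in __call__ means the body runs at most once per task id.
@shared_task(base=TaskWithLock)
def replicate(path):
    # ... do the actual work for `path` here ...
    return path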
@Skyross I've been investigating a similar issue this morning, as we are using Redis as our broker too. How long are your ETAs set to? We have multiple tasks with ETAs set anywhere between 1 and 3 days. I suspect our issue (and maybe yours as well) could be this: http://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html#redis-visibility-timeout |
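For anyone following the Redis caveat linked above: if your ETAs or countdowns can exceed the visibility timeout, the documented workaround is to raise that timeout above your longest delay. A minimal sketch, using Celery 3.1-style settings and a 4-day value chosen only to cover the 1-3 day ETAs mentioned above:

from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')

# The visibility timeout (seconds) must exceed the longest ETA/countdown,
# otherwise Redis redelivers the unacknowledged message to another worker.
app.conf.BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 4 * 24 * 60 * 60}  # 4 days

(In Celery 4.x the same option is spelled broker_transport_options.)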
@andynguyen11, I am not familiar with Redis, but duplicate tasks happen on other brokers for various reasons (see #2976), and this solution would help. @Skyross - Thanks for this. I see your solution uses the Django cache, which works for me, but I am wondering if we can find a more generic solution. How about checking the list of existing scheduled tasks (which one can retrieve by inspecting the workers) and skipping the send if the task is already scheduled?
Also, are you using this for workflows or single tasks? |
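A rough sketch of that approach, using the worker inspection API; note that scheduled() only reports ETA/countdown tasks a worker has already received, and the matching rule and usage below are illustrative only (the do_backup task comes from the original report):

from celery import current_app

def is_already_scheduled(task_name):
    """Return True if a task with this name is already scheduled (ETA/countdown) on any worker."""
    scheduled = current_app.control.inspect().scheduled() or {}
    for worker, entries in scheduled.items():
        for entry in entries:
            if entry.get('request', {}).get('name') == task_name:
                # A real check would also compare the task arguments here.
                return True
    return False

# Hypothetical usage before dispatching:
# if not is_already_scheduled('tasks.do_backup'):
#     do_backup.apply_async(('/tmp/__36',), countdown=3600)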
Hi all again. The previous version of my code contains at least one error: all tasks with ... Also, I created a public repository for that sandbox purpose (maybe I'll move it to GitHub later if needed). @andynguyen11 Yes, you were right! I came to the same conclusion. Anyway, sometimes we need to schedule a task further in the future than the visibility timeout allows. @topalhan Feel free to provide your version and compare performance 👍 UPD: moved the repo to https://gitlab.com/the-past/sandbox/Django-1.8.13-Celery-3.1.23 |
What's the situation after 4.1/master? |
Hi! I also encountered the duplication problem. |
Having the same problem here: a duplicate task_id is sent to multiple workers within 10 seconds. Is there any existing solution? Moreover, in my situation the task has no countdown. |
I'm having the same issue with duplicate tasks on Redis ...
Hope it will help some people.
EDIT: I had forgotten the return in the wrapper part; now it's corrected. |
Can this be added to the docs? Or is there any patch upstream? |
If you are speaking about my code |
So here is a version that is not hardcoded (but it still works only with Redis):
I forgot to say it, but actually it works only on Python 3 (if you want Python 2 support, remove the ':str' annotation in the definition of run_only_one_instance). |
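Since the snippet itself did not survive in this thread, here is a minimal sketch of the kind of decorator being described; only the name run_only_one_instance and the ':str' annotation come from the comment above, everything else (Redis connection, key scheme, timeout) is an assumption:

import functools

import redis

redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)

def run_only_one_instance(lock_name: str, timeout: int = 60 * 60):
    """Decorator: skip the wrapped function if another worker already holds the lock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # SET NX EX: only the first worker to create the key runs the body.
            if redis_client.set('lock:' + lock_name, '1', nx=True, ex=timeout):
                try:
                    return func(*args, **kwargs)
                finally:
                    redis_client.delete('lock:' + lock_name)
            return None  # duplicate delivery: skip silently
        return wrapper
    return decorator

A production version would more likely key the lock on the Celery task id (as TaskWithLock above does) rather than on a fixed name, so that distinct invocations of the same task are not blocked.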
would you mind sending a PR on master? |
Done, but it's the first time I've made a PR. |
May I know the status with celery==4.4.0? |
I still observe this issue with celery 4.4.0. I will check if I can reproduce it with a test case or some simpler reproduction steps. Everything is fine with
We noticed this issue after doing the upgrade:
and the issue persists with
Update on 3/3: Update on 3/6:
It seems like the issue happens after upgrading kombu from 4.2.2.post1 to 4.3.0. Used
Alright. My case is already mentioned in celery/kombu#1098. |
@auvipy @thedrow
The calling code:
Celery log:
You need to run the celery worker before calling the task and you must leave it running until the task runs. |
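The calling code and log referenced above were lost in extraction; a representative call that produces the behaviour discussed in the following comments is simply a task dispatched with a countdown longer than the broker's visibility timeout (the module, task name, and delay here are assumptions):

# Hypothetical calling code: a 45-minute countdown exceeds the ~30-minute
# visibility timeout described below, so the still-unacknowledged message
# is redelivered and the task runs twice.
from myapp.tasks import send_report  # assumed task

send_report.apply_async(args=('monthly',), countdown=45 * 60)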
@noamkush Interesting to see the timestamps, looks like the task is accepted first and then maybe re-scheduled after 30 minutes? And then executed twice? |
fixes are welcome |
So I figured out this is not an actual bug but SQS behavior. 30 minutes is celery's default visibility timeout and the message was simply redelivered to celery. Which brings me to the following: |
both can be improved :) looking forward to your PR |
can you check this celery/kombu#1098 (comment) |
The task is received from SQS but isn't acked, since there's a countdown. So Celery waits until the countdown is over without acking; meanwhile the visibility timeout expires and the task is redelivered (this is actually noted in the caveats). What I suggested is that the Celery process that is waiting on the countdown should extend the visibility timeout so the task won't be redelivered. |
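Until something along those lines exists, the practical workaround on SQS (as with Redis earlier in the thread) is to raise the visibility timeout above the longest countdown/ETA in use. A sketch, assuming a Celery 4.x app; 43200 seconds (12 hours) is the maximum SQS allows:

from celery import Celery

app = Celery('proj', broker='sqs://')

# The visibility timeout must exceed the longest countdown/ETA, otherwise the
# still-unacknowledged message is redelivered and the task runs twice.
app.conf.broker_transport_options = {'visibility_timeout': 43200}  # 12 hours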
If you have any insight into this case, how do I solve it? I got the same issue: I only have one job, and it is accepted on one instance and |
Thank you! I am using ... |
that is not a supported version. |
celery: 3.1.23 (Cipater)
rabbitmq-server:3.2.4-1
I found that for some tasks sent to RabbitMQ, the Celery workers receive duplicate tasks.
function:
def do_backup(name):
....
for example:
server: do_backup.delay("/tmp/__36")
worker02:
[2016-06-23 20:38:27,182: INFO/MainProcess] Received task: tasks.do_backup[40159186-1b88-4d98-aabf-839bb4b7a852]
[2016-06-23 20:38:27,183: INFO/MainProcess] tasks.do_backup[40159186-1b88-4d98-aabf-839bb4b7a852]: get file name: /tmp/__36
worker12:
[2016-06-23 20:36:21,647: INFO/MainProcess] Received task: tasks.do_backup[d265ffba-fa21-48b6-baad-2c0185a3168f]
........
[2016-06-23 20:37:01,492: INFO/MainProcess] tasks.do_backup[d265ffba-fa21-48b6-baad-2c0185a3168f]: get file name: /tmp/__36
[2016-06-23 20:37:01,554: INFO/MainProcess] tasks.do_backup[d265ffba-fa21-48b6-baad-2c0185a3168f]: do task
I made sure the server only sent "/tmp/__36" once.
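For reference, a complete, runnable sketch of the reporter's task module might look like the following; only the function name and the two log messages come from this report, while the app name, broker URL, logging setup, and task body are assumptions:

from celery import Celery
from celery.utils.log import get_task_logger

app = Celery('tasks', broker='amqp://guest@localhost//')
logger = get_task_logger(__name__)

@app.task
def do_backup(name):
    logger.info('get file name: %s', name)
    logger.info('do task')
    # ... perform the actual backup of `name` here ...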