.call being skipped when timeScale is very high

avancamp · January 29, 2020

Hi,

In some scenarios, my .call instances are not being invoked when I have set the timeScale of my timeline very high (for instance, 99).

However, I am as of yet completely unable to find a minimal repro for this. Everything I build in CodePen appears to work as expected. My production code which has this issue consists of many complex and deeply nested timelines, so I can only assume that the repro is somehow related to the exact construction of this complex timeline.

Given how much effort it would take for me to somehow convert this actual production code into a full-on CodePen to continue experimenting, I wanted to first ask if there were any well-known gotchas that might be causing this to happen. If so, I will investigate those first. If not, well I guess I need to set aside some hours to continue hunting down a minimal repro.

EDIT: This is on GSAP v3.1.1

Thanks,

Alex

GreenSock · January 29, 2020

Sorry to hear about the trouble, @avancamp. The only thing I can think of that sounds remotely like this was reported recently here: https://github.com/greensock/GSAP/issues/358 and it's very rare.

As mentioned in that thread, it's resolved in the next release which you can preview here: https://s3-us-west-2.amazonaws.com/s.cdpn.io/16327/gsap-latest-beta.min.js

Or if you need something you can npm install: https://s3-us-west-2.amazonaws.com/s.cdpn.io/16327/gsap-beta.tgz

If that doesn't resolve things for you, I can't imagine what the problem could be so yes, a reduced test case will go a loooong way in helping you out. We haven't received any other similar reports, so I don't have much else to go on.

avancamp · January 29, 2020

Thanks for the reply, Jack.

I tried out that 3.1.2 beta, and it unfortunately does not affect my issue. I'm about to head out of town for a week for a business trip, so it might be a while before I can come back to this and put more time into attempting to find a reduced test case.

Thanks,

Alex

avancamp · January 29, 2020

Another clue: the problem is inconsistent. Sometimes, there are no problems at all and everything appears as expected. Other times, it feels like almost every .call is being skipped.

I don't know exactly what this clue means yet, but it is important context.

EDIT: To clarify, I mean that I can change zero code and refresh the page a bunch of times, and the outcome can change.

GreenSock · January 29, 2020

I suspect something else is at play in your code (not sure though). Here's a simple test I ran a bunch of times in a few browsers and it NEVER broke:

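// Build a timeline that repeats 5 times and holds 1000 callbacks, spaced 1/6 of a second apart.
let tl = gsap.timeline({repeat:5, onRepeat:reset, onComplete: report}),
	calls = 1000,
	gap = 1 / 6,
	count = 0,
	i;
for (i = 0; i < calls; i++) {
	tl.add(doCall, (i + 1) * gap);
}

// Play the whole thing at 99x speed.
tl.timeScale(99);

// Each callback only counts itself.
function doCall() {
	count++;
}

// On every repeat, check the count for the iteration that just ended, then start over.
function reset() {
	report(true);
	count = 0;
}

// Log a problem if any callback was missed in the iteration.
function report(notFinal) {
	if (count !== calls) {
		console.log("PROBLEM! Only", count, "calls instead of", calls);
	} else if (notFinal) {
		console.log("repeating, but so far so good...");
	} else {
		console.log("SUCCESS!");
	}
}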

That literally tests 1000 calls embedded in a timeline 6 times (so 6000 total), and not a single one was ever missed, even with a timeScale of 99. So I'm scratching my head here, wondering what other variables there may be in your context.

avancamp · January 29, 2020

Thanks for spending some more time looking into it, I agree that at this point it's on me to just put in the hours to find out exactly what is happening. I'll follow up in a few weeks most likely.

GreenSock · January 30, 2020

I wonder if you're jumping around at all in your timeline and perhaps using seek()? That'd skip any calls between the old and new playhead positions, so maybe that makes it seem like GSAP is skipping them (but it's behaving as expected)? Totally guessing. I'm sure your demo will shed some light on it.
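To make the seek() idea concrete, here is a toy model in plain JavaScript. It does not use GSAP and is not how GSAP is implemented; `ToyTimeline`, `advanceTo`, and `firedPositions` are hypothetical names. It only assumes what the post above describes: normal playback fires every callback the playhead sweeps over, while a jump skips whatever sits between the old and new playhead positions.

```javascript
// Toy model only: NOT GSAP's implementation. All names here are hypothetical.
class ToyTimeline {
  constructor() {
    this.time = 0;       // current playhead position
    this.callbacks = []; // { position, fn } entries, kept sorted by position
  }

  // Register a callback at a position on the timeline.
  call(fn, position) {
    this.callbacks.push({ position, fn });
    this.callbacks.sort((a, b) => a.position - b.position);
    return this;
  }

  // Normal playback: fire every callback in (oldTime, newTime], in order.
  advanceTo(newTime) {
    for (const { position, fn } of this.callbacks) {
      if (position > this.time && position <= newTime) fn();
    }
    this.time = newTime;
  }

  // Jump: move the playhead without firing anything in between.
  seek(newTime) {
    this.time = newTime;
  }
}

const firedPositions = [];
const toyTimeline = new ToyTimeline();
for (let p = 1; p <= 5; p++) {
  toyTimeline.call(() => firedPositions.push(p), p);
}

toyTimeline.advanceTo(2); // fires the callbacks at 1 and 2
toyTimeline.seek(4);      // jumps past 3 and 4 without firing them
toyTimeline.advanceTo(5); // fires the callback at 5
console.log(firedPositions.join(", ")); // 1, 2, 5
```

If your code jumps the playhead like this anywhere, the callbacks at 3 and 4 would look "skipped" even though nothing is broken.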

Enjoy your business trip.

avancamp · January 30, 2020

More clues from my investigation (I can't sleep so I'm just doing this lol):

When this issue happens, the afflicted timelines will have a progress that is less than 1 when all is said and done. The afflicted timelines are not paused, but their progress still does not advance beyond whatever it gets stuck at. On every run, it seems that the progress of these afflicted timelines gets stuck at a different number. Perhaps of note is that the time value of the afflicted timelines in these scenarios always has a long decimal like 1.188000000000006, 0.7920000000000069, or 0.49500000000001165. The number changes every time I run the test. The timelines often get stuck, but where they get stuck seems to have no consistency.

Is this perhaps another manifestation of this same floating point rounding issue that you fixed elsewhere in the 3.1.2 beta? Is some rounding error preventing some triggers from running that would then prevent these timelines from being fully completed?
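For anyone unfamiliar with where decimals like these come from, here is a small sketch in plain JavaScript. `roundTime` is a hypothetical helper and the 5-decimal precision is an assumption for illustration; it is not taken from GSAP and says nothing about what the beta actually does. It only shows that floating point arithmetic leaves tiny residues, and that rounding to a fixed precision removes them before values are compared.

```javascript
// Hypothetical helper: round a time value to a fixed number of decimals.
// The default of 5 decimals is an arbitrary choice for this illustration.
function roundTime(value, decimals = 5) {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

// Plain floating point arithmetic leaves a tiny residue...
console.log(0.1 + 0.2 === 0.3);            // false
// ...which disappears once both sides are rounded to the same precision.
console.log(roundTime(0.1 + 0.2) === 0.3); // true

// One of the "long decimal" time values from the post above:
console.log(roundTime(1.188000000000006)); // 1.188
```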

This latest round of tests was indeed still done on that 3.1.2 beta build, so it alone does not appear to fix whatever problem I'm having (if it is indeed a GSAP problem and not a problem in my own code).

EDIT: So to clarify, the issue isn't that .call is being skipped, it's that the timeline is being left in an unfinished state, despite not being paused.

EDIT 2: I should also state that my timelines do heavy pause manipulation. By that I mean: they are often pausing themselves or their children to perform tasks with an indeterminate runtime, after which they resume themselves. But I have to stress that when this issue happens, timeline.paused() reports that the timelines are not paused anymore, but they still aren't progressing to their end.

avancamp · January 30, 2020

I managed to modify @GreenSock's example test code in a way which makes it intermittently fail. I did this by adding a pause and resume on every 10th run of the doCall method:

See the Pen oNgRmmx by Lange (@Lange) on CodePen

GreenSock · January 31, 2020

Yep, I see the problem and I'll need some time to dig into it and explain why that's happening. It's a very tricky scenario indeed, as I'll show you later.

GreenSock · February 1, 2020

I don't think this is actually a bug. I think it's a logic issue...

Let's imagine an extreme case to illustrate the point, where you've got 60 call()s packed into a 1-second span on a timeline, and then we increase the timeScale() of that timeline to 100 meaning there are now 6000 call()s per second. GSAP will move the playhead forward roughly 60 times per second, meaning that it's gonna jump past 100 of those call()s on EACH update.

On that first update, it'll start firing those callbacks in order, but because you've got logic in the doCall() method that tells the timeline to pause() on every 10th call, it'll only call the first 10 and STOP. So now we've got a backlog of 90 doCall() calls. You've got a requestAnimationFrame() that resumes the timeline almost immediately, so on the next GSAP update it moves the playhead forward again to the point where 200 doCall() calls are behind it, and it starts ripping through the remaining ones, starting with number 11 (remember the backlog?). Again, there's logic in doCall() that tells it to STOP after 10 more, so technically after this update the playhead is ahead of 200 doCall() calls but it has actually only called the first 20 and there's a backlog of 180! And so on, and so on...

At some point the playhead reaches the end of the timeline and fires the onComplete (as it should, since the playhead is now at that spot) but your logic in doCall() has prevented a lot of the calls from happening. See the issue?

Of course in the real world things don't play perfectly at 60 ticks per second (depending on processor load). My contrived example above is just to simplify things and understand the concept.
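The arithmetic above can be played out in a few lines of plain JavaScript. This is a toy simulation of the contrived example only, not GSAP's internals; `simulateBacklog` and its parameters are hypothetical names. It assumes exactly what is described above: each tick the playhead moves past a fixed number of callbacks, callbacks fire in order starting from the backlog, firing stops as soon as one of them pauses the timeline, the timeline is resumed before the next tick, and everything ends once the playhead reaches the end.

```javascript
// Toy simulation only: NOT GSAP's implementation. All names are hypothetical.
function simulateBacklog({ totalCalls, callsPerTick, pauseEvery }) {
  let fired = 0;    // callbacks actually invoked so far
  let playhead = 0; // number of callback positions the playhead has passed
  let ticks = 0;

  // The timeline "completes" once the playhead has passed every position.
  while (playhead < totalCalls) {
    ticks += 1;
    playhead = Math.min(totalCalls, playhead + callsPerTick);

    // Fire pending callbacks in order, starting with the backlog.
    while (fired < playhead) {
      fired += 1;
      // Every Nth callback pauses the timeline, which halts this render.
      if (pauseEvery > 0 && fired % pauseEvery === 0) break;
    }
    // The resume (the requestAnimationFrame in the demo) happens here,
    // before the next tick.
  }

  return { fired, skipped: totalCalls - fired, ticks };
}

// 100 callbacks passed per tick, pausing on every 10th call:
console.log(simulateBacklog({ totalCalls: 1000, callsPerTick: 100, pauseEvery: 10 }));
// { fired: 100, skipped: 900, ticks: 10 }

// Same timeline without the pausing logic:
console.log(simulateBacklog({ totalCalls: 1000, callsPerTick: 100, pauseEvery: 0 }));
// { fired: 1000, skipped: 0, ticks: 10 }
```

In this model the playhead reaches the end after 10 ticks either way; the only difference is how many callbacks were allowed to run on the way there.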

Does that clear things up?

avancamp · February 1, 2020

I have to be honest, I do not understand that explanation haha. I'm very confused.

ZachSaucier · February 1, 2020

In fewer words, there are too many calls to process because the timeframe to do them in is so small that some don't get processed.

Removing the requestAnimationFrame stuff fixes it:

See the Pen ExaqxpP?editors=1111 by GreenSock (@GreenSock) on CodePen

avancamp · February 1, 2020

Hm. I'm not sure what this means for the real-world code I have which seems to experience this issue. Am I somehow using pause in a way that is unsupported?

ZachSaucier · February 1, 2020

Maybe if you described your actual goal then we would be able to suggest a method that doesn't require requestAnimationFrame.

GreenSock · February 2, 2020

13 hours ago, avancamp said:

I have to be honest, I do not understand that explanation haha. I'm very confused.

I'll try to simplify it further...

When you pause() a timeline, it stops further rendering immediately. So let's say that on a single tick, the playhead moves ahead past 100 calls to doCall() (meaning there are 100 of them in between the previous playhead position and the new one). In that single render, it starts calling each of those 100 one-by-one in order, but in your code you've got logic that tells it to pause() the timeline as soon as doCall() is called 10 times! So 90 of them don't get called on that tick even though positionally they're supposed to have been called.

Then on the next tick (once you resume()), it moves the playhead forward to where it should be time-wise, and that means it just moved past ANOTHER 100 calls to doCall(). On that render, it'd begin calling them one-by-one starting with the backlog of 90 from that previous render...but again, you've got code that tells it to STOP (pause()) after only 10. Now we've got a backlog of 180 calls!

See the problem? GSAP is moving the playhead correctly according to the absolute time, but you've got code in place that's preventing the calls from completing! You're building an ever-increasing backlog but never giving it room to execute them all. It's not a bug in GSAP - it's a logic issue in your code.

Does that clarify things at all?

Like @ZachSaucier said, it'd probably help if you gave us a description of your real-world scenario/goal so that maybe we can help brainstorm a better solution.

avancamp · February 2, 2020

I think I see what happened -- I misunderstood the structure of this example test case and my modification does not make sense and is not an accurate representation of the real-world issue I am encountering. I will have a better explanation of my use case and issues in a few days. I am indeed not actually doing any infinite loops or using rAF in my real world code.

avancamp · February 4, 2020

Alright, here is a pretty long and in-depth explanation of what I'm trying to do and why. Apologies for the wall of text:

Background: I make graphics for online video broadcasts using GSAP. I do not make traditional websites, and do not have the same use cases, goals, concerns, or limitations as an actual web dev. Everything I make is in the context of live video broadcast graphics systems.

The specific codebase in question is 5 years old, under active development, and used regularly in production on major broadcasts. It is stable and, in normal use, exhibits no GSAP-related issues. However, many of our animations are extremely complicated, and large refactors of them are an absolute last resort.

Goal: Make all animations on the page finish instantly (or rather, as close to instant as possible), regardless of when they are added, for the purposes of automated screenshot comparison testing.

About the screenshot test system: The test framework loads one of my graphics pages in Puppeteer. It then issues a command to the page to play some animation. It then waits for a hardcoded amount of time for the animations to finish and takes a screenshot of the page, which it compares against a known-good reference. If there is a discrepancy, the test fails.

This system currently works, but is very slow because it just has to wait for all animations to play out in real-time before it can take the screenshot of their end state (the end state is the only thing we screenshot and test for). Running the test suite takes about 8 minutes, and is a major blocker to my daily productivity. This is purely my own fault for making such a flawed system, but is important context as to why I am motivated to solve this problem.

Why I'm not using gsap.globalTimeline.progress(1): It is currently hard for me to know when any given page I am testing is actually done adding things to the timeline. Yes, this is a flaw in my codebase. It is something I am working to resolve, but for legacy reasons will take a long time.

In the meantime, I am trying to speed up my tests using whatever means I have at my disposal. gsap.globalTimeline.timeScale(99) is the most promising short-term solution I have, because it means I can lower this hardcoded wait time substantially, and potentially run the test suite in as little as 1 minute instead of 8 minutes.

Why I’m not just rewriting my graphics to use GSAP in a more idiomatic way: Long-term, that is precisely my goal. However, doing that will be profoundly difficult and slow, given how long it takes me to run my tests. If I am to pull this refactor off, I just need a hacky way of making my tests faster right now, without having to rewrite my animations first.

How the issue manifests: Randomly, when running at a very high timescale, certain child timelines on the page will never reach their end state. They get "stuck" at various points. This issue has never happened in production, which only uses a timescale of 1. It only affects this primitive screenshot testing system.

My timelines often do what I’ve been calling “pause manipulation”, and I’m guessing that it is at the heart of the problem. It works something like this:

1. Start playing a "parent" animation.
2. At some point in this animation, the parent generates an indeterminate number of children, each of which plays an animation of indeterminate length.
3. Because it is not clear how long these child animations will be when they start, my solution in many places in this codebase is to call .pause() on the parent when these children begin running. The children run to completion, and then when they are complete I call .resume() on the parent.
   1. Yes, I know this seems like it does not make sense for many reasons, but I have to stress that this is actually rational in the context of these very complex animations and the limitations of the framework around them, and is the best compromise I could make at the time of authoring.
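For readers trying to picture the pattern, here is a minimal sketch of its bookkeeping in plain JavaScript. It is not the production code from this thread; `holdParentForChildren` and the stand-in parent object are hypothetical. The only assumption is that the parent exposes pause() and resume(), as GSAP timelines do, and that each child reports its own completion (for example from an onComplete callback).

```javascript
// Hypothetical helper: pause a parent until a known number of children finish.
// `parent` is any object with pause() and resume() methods.
function holdParentForChildren(parent, childCount) {
  let remaining = childCount;
  if (remaining <= 0) return () => {}; // nothing to wait for

  parent.pause();

  // Each child calls this once when it completes.
  return function onChildComplete() {
    if (remaining === 0) return; // ignore extra reports
    remaining -= 1;
    if (remaining === 0) parent.resume();
  };
}

// Stand-in parent that just records what happens to it.
const parentLog = [];
const fakeParent = {
  pause: () => parentLog.push("pause"),
  resume: () => parentLog.push("resume"),
};

const reportChildDone = holdParentForChildren(fakeParent, 3);
reportChildDone();
reportChildDone();
console.log(parentLog.join(", ")); // pause
reportChildDone();
console.log(parentLog.join(", ")); // pause, resume
```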

So, in reality, I am calling .pause() and .resume() maybe 1-6 times over the span of a 5-60 second animation that consists of multiple parallel timelines which are orchestrated together.

Summary: I have a large legacy codebase that works well in production. I want to have automated screenshot tests for this codebase, but I don’t want to have to wait for the animations to play out in realtime. In an attempt to fix this, I am running the animations with timeScale(99). But, when I do that, some of the timelines on the page never actually finish and get stuck at seemingly random points. When they are stuck, they are not actually paused, nor are any of their ancestor timelines (all the way up to the root) paused. And yet, they are stuck and not playing.

I hope that this context is useful, and I appreciate the time and attention given towards this vague and niche issue so far.

ZachSaucier · February 4, 2020

Thanks for the description! Interesting use case.

11 minutes ago, avancamp said:

Why I'm not using gsap.globalTimeline.progress(1): It is currently hard for me to know when any given page I am testing is actually done adding things to the timeline. Yes, this is a flaw in my codebase. It is something I am working to resolve, but for legacy reasons will take a long time.

This is the most confusing portion to me. Why are you affecting the global timeline?

The way I would try to structure it in theory (not knowing your code base or how things work exactly):

1. "It then issues a command to the page to play some animation" - when this happens, load only the relevant pieces and animations. Make sure they're paused.
2. Change the progress of each to 1.
3. Take your screenshot.
4. Clear things out and repeat for however many commands that you have.
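A rough sketch of the pause-then-jump part of those steps, in plain JavaScript. `jumpToEndState` and the stand-in timeline are hypothetical names; the sketch only assumes that each timeline object exposes pause() and progress(value), as GSAP timelines do. Loading the relevant pieces, taking the screenshot, and cleaning up are left out because they depend entirely on the test framework.

```javascript
// Hypothetical helper: pause each timeline and jump it to its end state.
// Each entry is assumed to expose pause() and progress(value).
function jumpToEndState(timelines) {
  for (const timeline of timelines) {
    timeline.pause();
    timeline.progress(1);
  }
}

// Stand-in timeline that records the calls it receives.
function makeFakeTimeline(name, calls) {
  return {
    pause: () => calls.push(name + ".pause()"),
    progress: (value) => calls.push(name + ".progress(" + value + ")"),
  };
}

const recordedCalls = [];
jumpToEndState([
  makeFakeTimeline("intro", recordedCalls),
  makeFakeTimeline("ticker", recordedCalls),
]);
console.log(recordedCalls.join(", "));
// intro.pause(), intro.progress(1), ticker.pause(), ticker.progress(1)
```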

Have you tried something along those lines?

avancamp · February 4, 2020

@ZachSaucier While I agree that is ideal, it is much harder in this framework than it should be because of mistakes I made 5 years ago.

Specifically, there is no single entrypoint into most of these animations. These timelines will fork off other independent timelines from various .call handlers, and the event callbacks of these timelines (onStart, onComplete, etc) will pause and resume each other. It is a very dumb and very complex web and there is no single thread to pull on. The simplest thing is just to bang on the global timeline, for now.

Though, in general, I think manipulating the global timeline for the purposes of screenshot testing is pretty ideal, and once these animations are better-written I'll still probably use gsap.globalTimeline.progress(1) because it'll just work everywhere (in theory).

ZachSaucier · February 4, 2020

2 minutes ago, avancamp said:

These timelines will fork off other independent timelines from various .call handlers, and the event callbacks of these timelines (onStart, onComplete, etc) will pause and resume each other.

Ah, I see.

You may already know this, but in general parent timelines should have full control over children timelines and never the opposite way around. A child timeline pausing a parent timeline makes for some weird logic (completely independent from how GSAP works).

avancamp · February 4, 2020

@ZachSaucier Absolutely, which is why I want to get this test suite to be fast using whatever hacks are available to me right now, so that I can rewrite my animations to be more idiomatic without wanting to pull my hair out while waiting for a super slow test suite to run. Faster tests means I can get my code to a more idiomatic implementation much more rapidly.

In short: I know how bad my code is right now, and I'm seeking help implementing this strange timeScale hack so that I can be more efficient in making my code less bad. It is a bandaid solution that will help me get to a real solution.

GreenSock · February 5, 2020

The most confusing thing for me (and I may have misunderstood) was that you have children that pause their own parent...and yet they still play? That shouldn't happen actually, because a child's playhead is controlled by the parent's. So if the parent's playhead stops, it won't sweep over the children, thus they're all essentially paused too. So how'd you get the children to play with the parent being paused?
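Here is a toy model of that parent/child relationship in plain JavaScript. It is not GSAP's implementation, and `createToyParent`, `tick`, and the `startTime`/`duration` fields are hypothetical names. It only assumes what is described above: a child's time is derived from the parent's playhead, so a parent that is not advancing leaves its children exactly where they are.

```javascript
// Toy model only: NOT GSAP's implementation. All names are hypothetical.
function createToyParent(children) {
  return {
    time: 0,
    paused: false,
    children,
    // Advance the parent playhead, then derive each child's time from it.
    tick(delta) {
      if (this.paused) return; // a paused parent sweeps over nothing
      this.time += delta;
      for (const child of this.children) {
        const localTime = this.time - child.startTime;
        child.time = Math.min(child.duration, Math.max(0, localTime));
      }
    },
  };
}

const toyChild = { startTime: 1, duration: 2, time: 0 };
const toyParent = createToyParent([toyChild]);

toyParent.tick(2);          // parent at 2, child at 1
toyParent.paused = true;
toyParent.tick(5);          // ignored: child stays at 1
toyParent.paused = false;
toyParent.tick(1);          // parent at 3, child at 2 (its end)
console.log(toyChild.time); // 2
```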

I'd love to find a solution for you, but it's just super-duper hard without any way of reproducing it. I'm not sure what to tell you, but I suspect you're right about the root of the problem being these pauses/resumes you're doing inside callbacks at super high-speed. Like...maybe you've got a ton of those getting triggered on a single tick, and they're stepping on each other. Remember, when a timeline gets paused inside of a callback, it immediately halts rendering of any other children in that timeline that would normally render at that time.

You said you already tried the latest beta and it didn't resolve things for you, right?

avancamp · February 10, 2020

I was unable to resolve this issue. I resorted to instead spamming `gsap.globalTimeline.progress(1)` on an interval. Something specifically about raising `.timeScale` just makes my timelines misbehave.
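A minimal sketch of what that workaround could look like, in JavaScript. Only the `gsap.globalTimeline.progress(1)` call comes from the post above; `startForcingEnd`, the 50 ms interval, and the injectable `timers` object are assumptions made for illustration (the injection is there so the helper can be exercised without real timers or GSAP).

```javascript
// Hypothetical helper: repeatedly force a timeline to its end on an interval.
// `timeline` is assumed to expose progress(value), like gsap.globalTimeline.
function startForcingEnd(
  timeline,
  intervalMs = 50, // arbitrary interval chosen for this sketch
  timers = {
    set: (fn, ms) => setInterval(fn, ms),
    clear: (id) => clearInterval(id),
  }
) {
  const id = timers.set(() => timeline.progress(1), intervalMs);
  // Call the returned function once the screenshot has been taken.
  return function stop() {
    timers.clear(id);
  };
}

// In a page with GSAP loaded, usage might look like:
//   const stop = startForcingEnd(gsap.globalTimeline);
//   ...wait, take the screenshot...
//   stop();

// Demonstration with stand-ins, so nothing here needs GSAP or real timers.
const progressCalls = [];
const fakeGlobalTimeline = { progress: (value) => progressCalls.push(value) };
const fakeTimers = {
  callback: null,
  cleared: null,
  set(fn) { this.callback = fn; return 42; },
  clear(id) { this.cleared = id; },
};

const stopForcing = startForcingEnd(fakeGlobalTimeline, 50, fakeTimers);
fakeTimers.callback(); // simulate three interval ticks
fakeTimers.callback();
fakeTimers.callback();
stopForcing();
console.log(progressCalls.join(", ")); // 1, 1, 1
```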

GreenSock · February 10, 2020

Please retry the latest beta (I updated it very recently and it applies some rounding to start/end times). I doubt it'll solve things for you, as I suspect there's a logic flaw in the way you were doing things most likely (though it's impossible to tell without any kind of reduced test case). I think that's the best I can do at this point.
