The function below is not useful in any way. I hope that keeping this thread will warn others from commiting similar mistakes. Before trusting your own workarounds for efficiency, please think and test like I should have.
As I'm in the process of making a physics library and want to do certain trigonometry without too many expensive function calls, I fell for the idea of trying to make a faster square root function. Hopefully it's faster than regular square root. If not, then it was fun making this function anyway.
Edit: This proves slower than standard Square Root
Edit: Changed it so 1st iteration uses bitshift instead of division. It still needs that last step to gain decent accuracy, but you can remove the last step before return. Removing that will make it very inexact... but still, it'll have pretty much round number precision without it.
Edit: Accuracy Demo (Current)
Number
BlitzSquareRoot
StarCraft 2
A normal calculator
I've capped after 6 digits
28
5.29541
5.29126
5.291502
0.5
0.75
0.707031
0.707106
20
4.472168
4.471924
4.472135
1200
34.641113
34.640869
34.641016
80
8.944336
8.944092
8.944271
6.7
2.589844
2.588379
2.588435
I don't claim this is better than anything in particular. I've never tried this kind of thing before, and don't know if it's is an actual speed improvement over the normal function. Wish I could find out somehow.
The division is the most expensive part of the function I believe. You can copy the last line and repeat it depending on your needs for more precision and some unknown decrease in speed.
Please give feedback and use it for as much good as you can.
Back in WC3, those workarounds usually suffered from the long function call time of Jass. Galaxy still has this problem, so calling 4 functions might be slower than using the default squareroot already.
I don't know under which circumstances I should test or how to track a measurable difference. Can you suggest how I can speedtest this thing? I've tried to assign it a crapload of square root calls and then my function to compare. It's a micro-difference that I can't measure in fps. Testing in that way got silly :)
Makes me wonder if it's worth it, even if we suppose mine was an improvement.
It could be altered to not call anything, though using more multiplications and divisions, and maybe at least use a While loop (is that a function call?)
Finally, consider table lookups with all their issues. To use an array with a size that SC2 will allow, you still need to use one or more Babylon iterations to refine the lookup result. I don't like the lookup workaround, but just wanted to remind you that it's a possibility.
But you're right, we need conclusive tests before it matters what I try to sell you. I could use finding out aswell. Just don't know how.
It's a micro-difference that I can't measure in fps.
That's it.
Speed only matters after the game starts losing fps. So unless you call A TON of Sqrts this is a viable alternative.
And if you call a ton of that then a table with all values predefined is probably a better idea anyway.
Speed only matters after the game starts losing fps. So unless you call A TON of Sqrts this is a viable alternative.
I was only planning on a few kilo, so it's viable. Though, no more viable than native sqrt. Hope you're all happy with my function for less accurate square roots. Use at your leisure, but don't call A TON ;)
Will look for a more useful waste of the 4th dimension time.
A possible method to determine how efficient it is is to make a loop that calls an arbitrary function each loop, as well as iterating a global variable, then a trigger that goes off after would display the global variable. The loop will thread-crash at some point, terminating the loop and the iteration. The higher the displayed value the faster the function is.
From my understanding, the thread stops after a certain amount of micro operations, not after a specific execution time. Afaik, the error message is misleading.
Heard that as well. A better way to compare speeds would be to run a while loop that checks if a certain boolean is true, and uses the sqrt function and modifies an integer by +1, then have a seperate trigger set the boolean to false after 1 second and send a text/debug message, and then see which case has a higher number in the end.
I've compared the way masterxel said. Majority of commenters are right. No matter what optimizations I make, I can't get a custom function to be faster than native square root. I first tried the function as it's written in my first post. SC2's was 44% faster than that! In other words, top post function = slow. I then optimized it to take fixed n and return integer, consequently limiting to 1 function call (+ the function itself), and compared to SC2's Square Root (Integer) instead. SC2's counterpart was still 25% faster.
I tried a lookup table with no custom functions and no precision refinements. This inevitably requires a FixedToInt conversion as you can't lookup with a fixed point number. This seems to run slightly, just a teenie weenie bit faster than Square Root (Integer): Called 4 integer sqrts in Trigger A every 0.0 seconds, and looked up 4 square roots in Trigger B also every 0.0 seconds using same number to find sqrt of. The lookup trigger ran at about 8-9 ms, and square root trigger ran more stabilly at 9 ms. That potential tad of speed ofcourse comes at the cost of a really large array. It should be mentioned that I used FixedToInt conversion for native sqrt (integer) while you don't need to do that. So native sqrt (int) might still be faster if I'd left out the conversion. I didn't think about that when comparing.
Because I was curious, I made a lookup method for Cosine aswell. The speed of this was very similar to the speed of sqrt lookups. And the difference between a table Cosine and native Cosine was the same (i.e. lookup method proved just slightly, slightly faster). I think I remembered not to convert for both functions in this try, so this is more valid info. Try yourself if you want to know for sure.
Conclusion: If anything can work around for speed at all, go for extremely simple lookup table methods. They might win you up to 11% of native counterpart's speed at the very most. I could test everything more thoroughly over what I've allready done, if you want a bigger science report... lol, but I don't feel like it. It's neigh impossible to make speedier workarounds. I'm sorry for stating text walls of things you probably took for granted. Yeah, and don't call functions you don't absolutely need.
The sqrt tests, using 3 periodic triggers, actually caused my almost empty map a big fps drop in single player.
Why are you trying to fix something that inst broken.... I pretty sure their square root function is probably as basic as you can get.
For your physics engine anything below .001 is basically irrelevant...... I know from using a physics engine in game and the distance involved at this point is completely unnoticeable
Did you have some kind of issue or you just didn't like the results of the function? Anyways you will generally be rounding any square root functions to the thousandths anyways...
Why are you trying to fix something that inst broken.... I pretty sure their square root function is probably as basic as you can get.
To answer that very short: I'm a crazy person.
What's basic depends on what level of precision you want. I imagine you could do with precision to 2 decimals. I kind of assumed their square root was more precision oriented than necessary for a physics engine. I'm also very sorry for making such a dry deal out of something this small and soaky.
heady stuff to me .... but looping to find the answer through all possible solutions doesn't sound like an optimal way to do it.....
Eh, all possible sollutions sounds a bit extreme. Notice it counts the number of times a previous value starting at 7 can be quadrupled before going past N. But I don't think it's very efficient the way it's done actually. Other methods I've seen did something to the mantissa in a double or floating point. Not possible in Galaxy.
EDIT: I hope keeping this thread will help others to not repeat the same mistakes as me.
The function below is not useful in any way. I hope that keeping this thread will warn others from commiting similar mistakes. Before trusting your own workarounds for efficiency, please think and test like I should have.
As I'm in the process of making a physics library and want to do certain trigonometry without too many expensive function calls, I fell for the idea of trying to make a faster square root function.
Hopefully it's faster than regular square root. If not, then it was fun making this function anyway.Edit: This proves slower than standard Square RootEdit: Changed it so 1st iteration uses bitshift instead of division. It still needs that last step to gain decent accuracy, but you can remove the last step before return. Removing that will make it very inexact... but still, it'll have pretty much round number precision without it.
Edit: Accuracy Demo (Current)
I don't claim this is better than anything in particular. I've never tried this kind of thing before, and don't know if it's is an actual speed improvement over the normal function. Wish I could find out somehow.The division is the most expensive part of the function I believe. You can copy the last line and repeat it depending on your needs for more precision and some unknown decrease in speed.
Please give feedback
and use it for as much good as you can.Back in WC3, those workarounds usually suffered from the long function call time of Jass. Galaxy still has this problem, so calling 4 functions might be slower than using the default squareroot already.
However, we would need a speedtest.
@Kueken531: Go
I don't know under which circumstances I should test or how to track a measurable difference. Can you suggest how I can speedtest this thing? I've tried to assign it a crapload of square root calls and then my function to compare. It's a micro-difference that I can't measure in fps. Testing in that way got silly :)
Makes me wonder if it's worth it, even if we suppose mine was an improvement.
It could be altered to not call anything, though using more multiplications and divisions, and maybe at least use a While loop (is that a function call?)
Finally, consider table lookups with all their issues. To use an array with a size that SC2 will allow, you still need to use one or more Babylon iterations to refine the lookup result. I don't like the lookup workaround, but just wanted to remind you that it's a possibility.
But you're right, we need conclusive tests before it matters what I try to sell you. I could use finding out aswell. Just don't know how.
That's it.
Speed only matters after the game starts losing fps. So unless you call A TON of Sqrts this is a viable alternative.
And if you call a ton of that then a table with all values predefined is probably a better idea anyway.
General rule of thumb: Natives are faster than solutions in galaxy.
Applies here as well.
I was only planning on a few kilo, so it's viable. Though, no more viable than native sqrt. Hope you're all happy with my function for less accurate square roots. Use at your leisure, but don't call A TON ;)
Will look for a more useful waste of
the 4th dimensiontime.A possible method to determine how efficient it is is to make a loop that calls an arbitrary function each loop, as well as iterating a global variable, then a trigger that goes off after would display the global variable. The loop will thread-crash at some point, terminating the loop and the iteration. The higher the displayed value the faster the function is.
@Ranakastrasz: Go
From my understanding, the thread stops after a certain amount of micro operations, not after a specific execution time. Afaik, the error message is misleading.
@Kueken531: Go
Heard that as well. A better way to compare speeds would be to run a while loop that checks if a certain boolean is true, and uses the sqrt function and modifies an integer by +1, then have a seperate trigger set the boolean to false after 1 second and send a text/debug message, and then see which case has a higher number in the end.
Best way to compare speeds is the trigger debugging window.
File -> Preferences -> Test Document -> Show Trigger Debugging Window
In the Threads tab you can see each trigger's Average Run Time (ms).
@masterxel: Go
Totally forgot about that. That's definitely the best way.
Warning: Wall of Text, big science report:
I've compared the way masterxel said. Majority of commenters are right. No matter what optimizations I make, I can't get a custom function to be faster than native square root. I first tried the function as it's written in my first post. SC2's was 44% faster than that! In other words, top post function = slow. I then optimized it to take fixed n and return integer, consequently limiting to 1 function call (+ the function itself), and compared to SC2's Square Root (Integer) instead. SC2's counterpart was still 25% faster.
I tried a lookup table with no custom functions and no precision refinements. This inevitably requires a FixedToInt conversion as you can't lookup with a fixed point number. This seems to run slightly, just a teenie weenie bit faster than Square Root (Integer): Called 4 integer sqrts in Trigger A every 0.0 seconds, and looked up 4 square roots in Trigger B also every 0.0 seconds using same number to find sqrt of. The lookup trigger ran at about 8-9 ms, and square root trigger ran more stabilly at 9 ms. That potential tad of speed ofcourse comes at the cost of a really large array. It should be mentioned that I used FixedToInt conversion for native sqrt (integer) while you don't need to do that. So native sqrt (int) might still be faster if I'd left out the conversion. I didn't think about that when comparing.
Because I was curious, I made a lookup method for Cosine aswell. The speed of this was very similar to the speed of sqrt lookups. And the difference between a table Cosine and native Cosine was the same (i.e. lookup method proved just slightly, slightly faster). I think I remembered not to convert for both functions in this try, so this is more valid info. Try yourself if you want to know for sure.
Conclusion: If anything can work around for speed at all, go for extremely simple lookup table methods. They might win you up to 11% of native counterpart's speed at the very most. I could test everything more thoroughly over what I've allready done, if you want a bigger science report... lol, but I don't feel like it. It's neigh impossible to make speedier workarounds. I'm sorry for stating text walls of things you probably took for granted. Yeah, and don't call functions you don't absolutely need.
The sqrt tests, using 3 periodic triggers, actually caused my almost empty map a big fps drop in single player.
@HuggetSukker: Go
Why are you trying to fix something that inst broken.... I pretty sure their square root function is probably as basic as you can get.
Did you have some kind of issue or you just didn't like the results of the function? Anyways you will generally be rounding any square root functions to the thousandths anyways...
http://mathworld.wolfram.com/SquareRootAlgorithms.html
heady stuff to me .... but looping to find the answer through all possible solutions doesn't sound like an optimal way to do it.....
as far as I can tell you need to write a function that handles it as a hydradic equation....
some java script examples... these all appear to do some looping ...
http://baris.aydinoglu.info/coding/three-square-root-methods
To answer that very short: I'm a crazy person.
What's basic depends on what level of precision you want. I imagine you could do with precision to 2 decimals. I kind of assumed their square root was more precision oriented than necessary for a physics engine. I'm also very sorry for making such a dry deal out of something this small and soaky.
Eh, all possible sollutions sounds a bit extreme. Notice it counts the number of times a previous value starting at 7 can be quadrupled before going past N. But I don't think it's very efficient the way it's done actually. Other methods I've seen did something to the mantissa in a double or floating point. Not possible in Galaxy.
EDIT: I hope keeping this thread will help others to not repeat the same mistakes as me.