Thursday, September 10, 2009

Opinion: Is C# faster than C++?

I will make my point starting from a blog post by a Java enterprise solution provider. In short, it claims that for medium and large programs Java is as fast as, or faster than, C/C++.

Before comparing platforms, we must understand how VM programs work compared with compiled programs. C# and Java programs target a VM (Virtual Machine), while C++ produces a native executable. The main difference between the two is that VM instructions are not processor instructions but abstract, machine-independent codes. They are more compact and easier to compile, but the processor cannot execute them directly.

How a C++ application is made: a developer writes code in C++ (or C), and a compiler that understands that code translates it into CPU instructions, producing an application that calls into the operating system for various functionality. Compilation may take a long time: a full-blown application like OpenOffice may take more than 10 hours to build on a 2GHz+ CPU.

How a Java/C# application is made: a compiler simply translates the source into simple statements that are much higher level than processor instructions and produces a bytecode application. In Java this is a .class or .jar file (a jar is a zip archive that contains multiple .class files); in .NET it is an .exe or .dll that contains MSIL code. At run time, the runtime's JIT (just-in-time) compiler generates the corresponding machine instructions, and only for the functions your program actually calls. Another difference is that most of these runtimes come with automatic memory management, while C++ does not, although garbage collectors for C++ do exist. A garbage collector has the big advantages of cache locality and really cheap allocation of small objects, but the disadvantage of pauses when the automatic memory cleanup occurs.

Judged purely on technology, a JIT should get better performance than a static compiler like a C++ one. Some of the reasons: it knows the target machine, so it can align data and rearrange structs in memory; it can use the parallel instructions that the host machine supports; because only the code that is actually used gets translated into instructions, large frameworks get better cache locality; and it can inline, in specific cases, functions that live in other libraries, since they are bytecode rather than final machine code. Also, the garbage collector should bring almost cost-free allocation and better cache locality. The dynamic nature of a JIT gives it as much information as any static compiler can get, or more.

In practice, however, I must disagree that a Java/.NET virtual machine can deliver the same performance as C++ on the same code. For a specific loop you may get similar performance, but C++ has several advantages: CPU-specific tuning in new compilers; profile-guided optimization (which gathers dynamic information about how your application behaves, almost the same information that a JIT has); the ability to write a custom memory allocator (one commonly used in embedded programming is the circular buffer, where allocating an item is as cheap as a garbage collector allocation, meaning a pointer addition, but without the pauses a garbage collector brings); and direct access to the outside world, whereas calls from a VM to the native operating system always go through a translation layer (JNI or P/Invoke) that implies some overhead, such as marshaling. C++ can also be linked with other object files that have been optimized in assembly by CPU makers. Finally, the long compile time also translates into much more analysis done by static compilers to squeeze out the last drop of performance.
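To make the "allocation is just a pointer add" remark concrete, here is a minimal sketch of a circular-buffer allocator in C++. The class name `RingAllocator` and its methods are hypothetical and invented for this illustration; this is not code from NaroCAD or from any particular library. It assumes that blocks are short-lived, so that overwriting the oldest ones when the buffer wraps is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical circular-buffer (ring) allocator, for illustration only.
// Allocation is a pointer add; nothing is freed individually and there is
// no collector, so there are no pauses. The price: when the buffer wraps,
// the oldest blocks are silently reused, so callers must be done with them.
class RingAllocator {
public:
    explicit RingAllocator(std::size_t capacity)
        : buffer_(capacity), offset_(0) {
        assert(capacity > 0);
    }

    // Returns `size` bytes whose offset from the buffer start is a multiple
    // of `align` (assumed to be a power of two), or nullptr if the request
    // is larger than the whole buffer. Alignment is relative to the buffer
    // start; this sketch assumes the buffer base itself is suitably aligned.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        if (size > buffer_.size()) return nullptr;
        // Round the current offset up to the requested alignment.
        std::size_t start = (offset_ + align - 1) & ~(align - 1);
        // Not enough room at the end: wrap around to the beginning.
        if (start + size > buffer_.size()) start = 0;
        offset_ = start + size;  // the "pointer add"
        return buffer_.data() + start;
    }

    // Offset just past the most recent allocation.
    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

With a 64-byte buffer, two 16-byte requests return adjacent blocks, and a later 40-byte request no longer fits at the end, so it wraps and reuses the start of the buffer. A real embedded allocator would normally also track which blocks are still in use.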

The good side is that .NET and Java give just enough performance in their fields. Also, a well-written algorithm can get much better performance regardless of language. Just by using generics you can get a big performance increase, since run-time type checks (which mean overhead) are no longer needed. Generic collections and LINQ will also give you good performance. The garbage collector does not guarantee the absence of memory leaks, but it does guarantee that you never have double-deleted or invalid pointers. When you access a reference, it is either null or an object of the type you specified. C#/.NET and Java are strongest at integer operations, inlining of function calls and constant propagation across satellite/external assemblies, string operations, and fast heap allocations. After startup they run fast enough that you will not notice it is a bytecode application.

The last important thing regarding speed: sometimes speed is measured wrong. Profiling NaroCAD shows that its abstraction layer is easy enough to work with, and that the slowest part for now is not the JIT or the runtime but the C++/OpenCascade code. This is not because OpenCascade is slow, but because OpenCascade, with its complex C++ codebase, has not yet had enough time spent on optimizing it.

There is only one more point I want to make regarding speed. On a cheap laptop of today you can make a clean build of NaroCAD, which has more than 800 classes, in 17 seconds, and build after a change in 3 seconds, which mostly involves copying assemblies to a destination folder. Startup, which loads all the OpenCascade libraries and Windows DLLs such as the UI ones and JIT-compiles everything it needs, such as the component framework, takes 10 seconds. Of that, the JIT takes some 5-6 seconds to compile almost all of NaroCAD; as you use more functionality, you will see around 2-3 seconds of extra JIT-ing over time, meaning a delay of about 0.1 seconds on each operation, and only the first time it runs. If you don't want to wait those 5 seconds and want to dramatically improve NaroCAD's startup speed, you can precompile it with NGen: run cmd as administrator and, from the NaroCAD folder, type: C:\Windows\Microsoft.NET\Framework\v2.0.50727\ngen.exe install narocad.exe

11 comments:

TJ Bandrowsky said...

I've been doing some benchmarking and disassembly of C# versus C++ in 64 bit land, and I think C# containers can be faster than STL containers. Check this out:

http://www.treatyist.com/issue1/cpp_vs_csharp_arrays.aspx

Ciprian Khlud said...

Thank you for your answer! It is really comprehensive and I love all its details! It is a great read and a nice piece of information!

Thanks for clearing this up!

Ciprian Khlud said...

You may be interested in the MS articles about the cost of bounds checking in managed and unmanaged (unsafe) code:
http://blogs.msdn.com/clrcodegeneration/archive/2009/08/13/array-bounds-check-elimination-in-the-clr.aspx (for managed) and http://blogs.msdn.com/clrcodegeneration/archive/2008/02/08/performance-implications-of-unmanaged-array-accesses.aspx (unsafe)

Rajendra said...

Hi everyone,

I have a problem building NaroCAD's OCWrapper with MS VS 2010 Express edition (VC++9). The error is C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorlib.dll.
This dll file exists there.
I am trying to use NaroCAD for my OCC project but have not succeeded yet.
I hope to get some ideas from your experience with this issue.
Thank you.

Ciprian Khlud said...

Hi Rajendra, I honestly did not try to rebuild the wrapper under VS 2010, but I think this MSBuild message does not expose the real error.
In VS2008 we ran into problems building the full wrapper, with errors about using too much memory, so we use a minimalist wrapper (still big enough to cover all of NaroCAD's usages).
As a starting point, please try to compile this wrapper, which is placed under: narocad\trunk\wrappers\win32\OCWrappers\OCWrappersClassesLink_mostcomplete.vcproj

I hope this answers your question; if not, please describe your error in detail and I will look into it in the coming days.

Rajendra said...

Hi ciplogic,
I succeeded in building \win32\OCWrappers\OCWrappersClassesLink_mostcomplete.vcproj with VS 2008 Express.
TKMesh.lib is missing from the input libraries there.
Thank you very much.
As you said, I had a memory usage error when building the whole wrapper.
I will try with this wrapper, and if unsuccessful I will certainly be bothering you with problems.
Again, thanks a lot.

Rajendra said...

Hi ciplogic,

I have succeeded in building the wrapper from OCWrappersClassesLink_mostcomplete.vcproj.
I would like to ask a question about the OCWrapper.
Does this OCWrappersClassesLink_mostcomplete.vcproj have the full functionality of OpenCascade 6.3?
I hope to learn more about using the wrappers in C#.
Thank you.
Also

Ciprian Khlud said...

No. This NaroWrapper covers NaroCAD's functionality and related classes. Since Naro has only some of the functionality, it covers around one half of the most common classes.
If you want to see all the wrapped classes, you can use the Visual Studio Object Browser tool to expand them, or in your VS C# solution you can type: var obj = new OCNaroWrappers. (and here the full list of classes will appear).
Hope it helps,
Ciprian

Rajendra said...

hi ciplogic,
Is there any way to build the full wrapper? Since there is an example, I would like to implement it.
There is something written about cdl files that are used to generate the wrapper.
I would rather not work with cdl files, if I could see the implementation of the makebottle example too.
I think there must be some way to build and use the full wrapper, since it is kept there.
I hope to find a way with your help.
Thank you.

bxtrx said...

Hi Rajendra,

There is the trunk\wrappers\OCWrappers.sln project, which contains all the generated wrappers and builds with VS2008. The problem when building that project is that after a while the compiler stops with an out of memory error. Then you have to press build again (not rebuild, just build) to finish the build (it seems that the compiler leaks memory, and after it eats all the memory it needs to be started again).

On 32 bits, be sure that you have a strong machine with 3 GB of RAM.

Rajendra said...

hi bxtrx,
I have tried building again (not rebuild) from the source, but a link error came up, related to a memory issue.

>Finished searching libraries
1>LINK : warning LNK4098: defaultlib 'MSVCRTD' conflicts with use of other libs; use /NODEFAULTLIB:library
1>Searching libraries
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKSTEPAttr.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKSTEPBase.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKernel.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKXSBase.lib:
1> Searching C:\Program Files\Microsoft Visual Studio 9.0\VC\lib\msvcrt.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKLCAF.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKBrep.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKIGES.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKShHealing.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKStep.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKShapeSchema.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKBO.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\FWOSPlugin.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\PTKernel.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKBool.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKCAF.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKCDF.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKDraw.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKFeat.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKFillet.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKG2d.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKG3d.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKGeomAlgo.lib:
1> Searching C:\OpenCASCADE6.3.0\ros\win32\lib\TKGeomBase.lib:
1>LINK : fatal error LNK1102: out of memory
1>Build log was saved at "file://c:\....\NaroCAD\narocad\trunk\wrappers\win32\OCWrappers\Debug\BuildLog.htm"
1>OCWrappersClasses - 1 error(s), 1 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
I do not think this is a machine specification problem.
I think Link.exe has exceeded its memory limit here.
I will try to make further progress and post an update here too.
Thank you very much, everyone here.