c# – Is dual-core performance worse than single-core?

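The following NUnit test compares the performance of running a single thread against running 2 threads on a dual-core machine. Specifically, this is a dual-core virtual Windows 7 machine under VMWare, running on a quad-core Linux SLED host, a Dell Inspiron 503.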

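Each thread simply loops and increments 2 counters, addCounter and readCounter. The test originally exercised a Queue implementation that turned out to perform worse on a multi-core machine. In narrowing the problem down to a small, reproducible piece of code, the queue is gone and all that remains is incrementing variables, and to my shock and dismay 2 threads are far slower than one.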

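When running the first test, Task Manager shows one of the cores 100% busy and the other core almost idle. Here is the output of the single-threaded test: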

readCounter 360687000
readCounter2 0
total readCounter 360687000
addCounter 360687000
addCounter2 0

You see over 360 million increments!

Next, the dual-thread test shows both cores 100% busy for the entire 5-second duration of the test. But its output shows:

readCounter 88687000
readCounter2 134606500
totoal readCounter 223293500
addCounter 88687000
addCounter2 67303250
addFailure0

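That's only 223 million read increments. What in god's creation are those 2 CPUs doing for those 5 seconds to get less work done?

Any possible clue? And can you run the tests on your machine to see whether you get different results? One idea is that VMWare dual-core performance may not be what you would hope.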

using System;
using System.Threading;
using NUnit.Framework;

namespace TickZoom.Utilities.TickZoom.Utilities
{
    [TestFixture]
    public class ActiveMultiQueueTest
    {
        private volatile bool stopThread = false;
        private Exception threadException;
        private long addCounter;
        private long readCounter;
        private long addCounter2;
        private long readCounter2;
        private long addFailureCounter;

        [SetUp]
        public void Setup()
        {
            stopThread = false;
            addCounter = 0;
            readCounter = 0;
            addCounter2 = 0;
            readCounter2 = 0;
        }


        [Test]
        public void TestSingleCoreSpeed()
        {
            var speedThread = new Thread(SpeedTestLoop);
            speedThread.Name = "1st Core Speed Test";
            speedThread.Start();
            Thread.Sleep(5000);
            stopThread = true;
            speedThread.Join();
            if (threadException != null)
            {
                throw new Exception("Thread Failed: ",threadException);
            }
            Console.Out.WriteLine("readCounter " + readCounter);
            Console.Out.WriteLine("readCounter2 " + readCounter2);
            Console.Out.WriteLine("total readCounter " + (readCounter + readCounter2));
            Console.Out.WriteLine("addCounter " + addCounter);
            Console.Out.WriteLine("addCounter2 " + addCounter2);
        }

        [Test]
        public void TestDualCoreSpeed()
        {
            var speedThread1 = new Thread(SpeedTestLoop);
            speedThread1.Name = "Speed Test 1";
            var speedThread2 = new Thread(SpeedTestLoop2);
            speedThread2.Name = "Speed Test 2";
            speedThread1.Start();
            speedThread2.Start();
            Thread.Sleep(5000);
            stopThread = true;
            speedThread1.Join();
            speedThread2.Join();
            if (threadException != null)
            {
                throw new Exception("Thread Failed: ",threadException);
            }
            Console.Out.WriteLine("readCounter " + readCounter);
            Console.Out.WriteLine("readCounter2 " + readCounter2);
            Console.Out.WriteLine("totoal readCounter " + (readCounter + readCounter2));
            Console.Out.WriteLine("addCounter " + addCounter);
            Console.Out.WriteLine("addCounter2 " + addCounter2);
            Console.Out.WriteLine("addFailure" + addFailureCounter);
        }

        private void SpeedTestLoop()
        {
            try
            {
                while (!stopThread)
                {
                    for (var i = 0; i < 500; i++)
                    {
                        ++addCounter;
                    }
                    for (var i = 0; i < 500; i++)
                    {
                        readCounter++;
                    }
                }
            }
            catch (Exception ex)
            {
                threadException = ex;
            }
        }

        private void SpeedTestLoop2()
        {
            try
            {
                while (!stopThread)
                {
                    for (var i = 0; i < 500; i++)
                    {
                        ++addCounter2;
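                        i++; // second increment of i: this loop body runs only 250 times per pass, so addCounter2 ends up at half of readCounter2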
                    }
                    for (var i = 0; i < 500; i++)
                    {
                        readCounter2++;
                    }
                }
            }
            catch (Exception ex)
            {
                threadException = ex;
            }
        }


    }
}

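Edit: I tested the above on a quad-core laptop without VMWare and got similarly degraded performance. So I wrote another test similar to the one above, but with each thread method in a separate class. My purpose was to test 4 cores.

That test showed excellent results, improving almost linearly with 1, 2, 3 or 4 cores.

After some experimentation on both machines, it appears that proper performance only happens when the main thread methods are on different instances rather than on the same instance.

In other words, if multiple threads have their main entry methods on the same instance of a particular class, performance on a multi-core machine gets worse with each thread you add, rather than better as you would expect.

It almost looks as if the CLR is "synchronizing" so that only one thread at a time can run in that method. However, my testing says that isn't the case, so it is still unclear what is happening.

But my own problem seems to be solved simply by giving each thread a separate instance of the class whose method it runs as its starting point.

Sincerely,
Wayne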
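The separate-instance arrangement described in this edit could look roughly like the sketch below. The second test itself is not shown in the post, so `SpeedWorker` and `SeparateInstanceRunner` are hypothetical names, and the loop body is simply copied from `SpeedTestLoop` above. Treat it as an illustration under those assumptions, not as the code that produced the near-linear results.

```csharp
using System;
using System.Threading;

// Hypothetical worker: one instance per thread, each with its own
// counters and its own stop flag (nothing is shared between workers).
public class SpeedWorker
{
    private volatile bool stop;
    private long addCounter;
    private long readCounter;
    private readonly Thread thread;

    public SpeedWorker(string name)
    {
        thread = new Thread(Loop) { Name = name };
    }

    public long AddCounter { get { return addCounter; } }
    public long ReadCounter { get { return readCounter; } }

    public void Start() { thread.Start(); }

    public void Stop()
    {
        stop = true;
        thread.Join(); // once Join returns, the counters are safe to read
    }

    private void Loop()
    {
        // Same shape as SpeedTestLoop: the flag is checked once per 1000 increments.
        while (!stop)
        {
            for (var i = 0; i < 500; i++) { ++addCounter; }
            for (var i = 0; i < 500; i++) { ++readCounter; }
        }
    }
}

public static class SeparateInstanceRunner
{
    // Starts one worker per thread, lets them run for the given time,
    // then stops them and returns the sum of their read counters.
    public static long Run(int threadCount, int milliseconds)
    {
        var workers = new SpeedWorker[threadCount];
        for (var i = 0; i < threadCount; i++)
        {
            workers[i] = new SpeedWorker("Speed Test " + (i + 1));
            workers[i].Start();
        }
        Thread.Sleep(milliseconds);

        long total = 0;
        foreach (var worker in workers)
        {
            worker.Stop();
            total += worker.ReadCounter;
        }
        return total;
    }
}
```

Calling `SeparateInstanceRunner.Run(1, 5000)` and then `SeparateInstanceRunner.Run(2, 5000)` gives totals comparable to the tests above. Separate instances only make it likely, not certain, that the counters of different threads sit far apart in memory, since small objects allocated back to back can still be neighbours on the heap.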

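Edit:

Here is an updated unit test that tests 1, 2, 3 and 4 threads, all of them on the same instance of a class. The variables used in the thread loop are held in arrays and spaced at least 10 elements apart. Performance still degrades significantly with each thread added.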

using System;
using System.Threading;
using NUnit.Framework;

namespace TickZoom.Utilities.TickZoom.Utilities
{
    [TestFixture]
    public class MultiCoreSameClassTest
    {
        private ThreadTester threadTester;
        public class ThreadTester
        {
            private Thread[] speedThread = new Thread[400];
            private long[] addCounter = new long[400];
            private long[] readCounter = new long[400];
            private bool[] stopThread = new bool[400];
            internal Exception threadException;
            private int count;

            public ThreadTester(int count)
            {
                for( var i=0; i<speedThread.Length; i+=10)
                {
                    speedThread[i] = new Thread(SpeedTestLoop);
                }
                this.count = count;
            }

            public void Run()
            {
                for (var i = 0; i < count*10; i+=10)
                {
                    speedThread[i].Start(i);
                }
            }

            public void Stop()
            {
                for (var i = 0; i < stopThread.Length; i+=10 )
                {
                    stopThread[i] = true;
                }
                for (var i = 0; i < count * 10; i += 10)
                {
                    speedThread[i].Join();
                }
                if (threadException != null)
                {
                    throw new Exception("Thread Failed: ",threadException);
                }
            }

            public void Output()
            {
                var readSum = 0L;
                var addSum = 0L;
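                // NOTE: Run() passes i (0, 10, 20, ...) as the index and SpeedTestLoop
                // multiplies it by 10 again, so the threads write slots 0, 100, 200, ...
                // This loop reads slots 0..count-1 and therefore only sees the first thread's counters.
                for (var i = 0; i < count; i++)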
                {
                    readSum += readCounter[i];
                    addSum += addCounter[i];
                }
                Console.Out.WriteLine("Thread readCounter " + readSum + ",addCounter " + addSum);
            }

            private void SpeedTestLoop(object indexarg)
            {
                var index = (int) indexarg;
                try
                {
                    while (!stopThread[index*10])
                    {
                        for (var i = 0; i < 500; i++)
                        {
                            ++addCounter[index*10];
                        }
                        for (var i = 0; i < 500; i++)
                        {
                            ++readCounter[index*10];
                        }
                    }
                }
                catch (Exception ex)
                {
                    threadException = ex;
                }
            }
        }

        [SetUp]
        public void Setup()
        {
        }


        [Test]
        public void SingleCoreTest()
        {
            TestCores(1);
        }

        [Test]
        public void DualCoreTest()
        {
            TestCores(2);
        }

        [Test]
        public void TriCoreTest()
        {
            TestCores(3);
        }

        [Test]
        public void QuadCoreTest()
        {
            TestCores(4);
        }

        public void TestCores(int numCores)
        {
            threadTester = new ThreadTester(numCores);
            threadTester.Run();
            Thread.Sleep(5000);
            threadTester.Stop();
            threadTester.Output();
        }
    }
}

Solution

That’s only 223 million read increments. What in god’s creation are those 2 CPUs doing for those 5 seconds to get less work done?

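You are probably running into cache contention. When a single CPU increments your integer, it can do so in its own L1 cache, but as soon as two CPUs start "fighting" over the same value, the cache line it lives on has to be copied back and forth between their caches every time either of them accesses it. The extra time spent copying data between caches adds up fast, especially when the operation you are performing (incrementing an integer) is so trivial.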
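In the first test each thread updates its own fields, but the four counters are adjacent long fields of one object, so they very likely share a cache line and contend in the same way. One way to probe this is to give every thread a fixed amount of work and vary only how far apart the counters sit. The sketch below is an assumption-laden illustration: `CacheLineDemo`, `Run` and the `stride` parameter are hypothetical names, and a 64-byte cache line is assumed, which is common but not guaranteed.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class CacheLineDemo
{
    // Runs threadCount threads; thread t increments counters[t * stride]
    // exactly `iterations` times. Returns the elapsed milliseconds and
    // hands back each thread's final count through `results`.
    public static long Run(int threadCount, int stride, long iterations, out long[] results)
    {
        var counters = new long[threadCount * stride];
        var threads = new Thread[threadCount];
        for (var t = 0; t < threadCount; t++)
        {
            var slot = t * stride; // each thread owns exactly one slot
            threads[t] = new Thread(() =>
            {
                for (long i = 0; i < iterations; i++)
                {
                    counters[slot]++;
                }
            });
        }

        var watch = Stopwatch.StartNew();
        foreach (var thread in threads) thread.Start();
        foreach (var thread in threads) thread.Join();
        watch.Stop();

        // Copy out one value per thread, reading the same slots the threads wrote.
        results = new long[threadCount];
        for (var t = 0; t < threadCount; t++)
        {
            results[t] = counters[t * stride];
        }
        return watch.ElapsedMilliseconds;
    }
}
```

With a stride of 1 the counters of neighbouring threads are 8 bytes apart and almost certainly on one cache line. With a stride of 8 or more they are at least 64 bytes apart and, under the 64-byte assumption, cannot share a line. Comparing the elapsed times of the two runs for the same thread count and iteration count shows whether cache-line sharing explains the slowdown on a given machine.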
