마침내 등장한 인텔 스카이레이크X 7900X, 7820X, 7800X, 카비레이크X 7740 테스트
|Skylake-X Processors (Low Core Count Chips)|
|Core i7-7800X||Core i7-7820X||Core i9-7900X|
|Cores / Threads||6/12||8/16||10/20|
|Base Clock||3.5 GHz||3.6 GHz||3.3 GHz|
|Turbo Clock||4.0 GHz||4.3 GHz||4.3 GHz|
|TurboMax Clock||N/A||4.5 GHz||4.5 GHz|
|L3||8.25 MB||11 MB||13.75 MB|
신형 스카이레이크X 스펙표
7800X : 6코어 12스레드 / 베이스 클럭 3.5 GHz / 부스트 클럭 4.0 GHz / L3 8.25 MB / DDR4-2400
7820X : 8코어 16스레드 / 베이스 클럭 3.6 GHz / 최대 부스트 클럭 4.5 GHz / L3 11 MB / DDR4-2666
7900X : 10코어 20스레드 / 베이스 클럭 3.3 GHz / 최대 부스트 클럭 4.5 GHz / L3 13.75 MB / DDR4-2666
TDP : 140W 공통
PCIe 레인수 : 7800X / 7820X = 28레인, 7900X : 44레인
가격 : 7800X - 389달러 / 7820X - 599달러 / 7900X - 999달러
신형 프로세서 패키지
[ 테스트 시스템 ]
|Processor||Intel Core i9-7900X (10C/20T, 140W, 3.3 GHz) |
Intel Core i7-7820X (8C/20T, 140W, 3.6 GHz)
Intel Core i7-7800X (6C/12T, 140W, 3.5 GHz)
Intel Core i7-7740X (4C/8T, 112W, 4.3 GHz)
Intel Core i5-7640X (4C/4T, 112W, 4.0 GHz)
|Motherboards||ASRock X299 Taichi |
MSI X299 Gaming Pro Carbon
GIGABYTE X299 Gaming 9
|Cooling||Thermalright TRUE Copper|
|Power Supply||Corsair AX760i PSU |
Corsair AX1200i Platinum PSU
|Memory||Corsair Vengeance Pro DDR4-2666 4x8 GB or |
G.Skill Ripjaws 4 DDR4-2666 C15 4x8GB or
TeamGroup NightHawk RGB DDR4-3000
|Video Cards||MSI GTX 1080 Gaming 8GB |
ASUS GTX 1060 Strix
Sapphire R9 Fury 4GB
Sapphire RX 480 8GB
Sapphire RX 460 2GB
|Hard Drive||Crucial MX200 1TB|
|Optical Drive||LG GH22NS50|
|Case||Open Test Bed|
|Operating System||Windows 10 Pro 64-bit|
First up is a self-penned test using a monstrous PDF we once received in advance of attending an event. While the PDF was only a single page, it had so many high-quality layers embedded it was taking north of 15 seconds to open and to gain control on the mid-range notebook I was using at the time. This put it as a great candidate for our 'let's open an obnoxious PDF' test. Here we use Adobe Reader DC, and disable all the update functionality within. The benchmark sets the screen to 1080p, opens the PDF to in fit-to-screen mode, and measures the time from sending the command to open the PDF until it is fully displayed and the user can take control of the software again. The test is repeated ten times, and the average time taken. Results are in milliseconds.
The extra frequency of the new processors is helping when it comes to opening our monster PDF, but also the extra L2 cache is likely having an effect as well.
One of the more interesting workloads that has crossed our desks in recent quarters is FCAT - the tool we use to measure stuttering in gaming due to dropped or runt frames. The FCAT process requires enabling a color-based overlay onto a game, recording the gameplay, and then parsing the video file through the analysis software. The software is mostly single-threaded, however because the video is basically in a raw format, the file size is large and requires moving a lot of data around. For our test, we take a 90-second clip of the Rise of the Tomb Raider benchmark running on a GTX 980 Ti at 1440p, which comes in around 21 GB, and measure the time it takes to process through the visual analysis tool.
FCAT takes in a frame, processes it and dumps it, all on a single thread. The quicker you get through the workload the better, and frequency is supreme, hence we get the 7820X followed by the 7800X then the 7900X. Even though the 7900X has the higher turbo here, the results are with the margin expected.
3D Particle Movement v2.1
This is the latest version of the self-penned 3DPM benchmark. The goal of 3DPM is to simulate semi-optimized scientific algorithms taken directly from my doctorate thesis. Version 2.1 improves over 2.0 by passing the main particle structs by reference rather than by value, and decreasing the amount of double->float->double recasts the compiler was adding in. It affords a ~25% speed-up over v2.0, which means new data.
Give 3DPMv2.1 some cores, and it will show you the world / some numbers. The 1800X and 6950X were gunning for top spot, but the extra frequency of the 7900X wins here.
Despite being a couple of years old, the DigiCortex software is a pet project for the visualization of neuron and synapse activity in the brain. The software comes with a variety of benchmark modes, and we take the small benchmark which runs a 32k neuron/1.8B synapse simulation. The results on the output are given as a fraction of whether the system can simulate in real-time, so anything above a value of one is suitable for real-time work. The benchmark offers a 'no firing synapse' mode, which in essence detects DRAM and bus speed, however we take the firing mode which adds CPU work with every firing.
DigiCortex loves a bit of memory, although when speaking with the developer, there can some instances where the beast needs to be fed. Losing the inclusive L3 might be a factor here, especially with the 7800X all the way down.
Agisoft Photoscan 1.0
Photoscan stays in our benchmark suite from the previous version, however now we are running on Windows 10 so features such as Speed Shift on the latest processors come into play. The concept of Photoscan is translating many 2D images into a 3D model - so the more detailed the images, and the more you have, the better the model. The algorithm has four stages, some single threaded and some multi-threaded, along with some cache/memory dependency in there as well. For some of the more variable threaded workload, features such as Speed Shift and XFR will be able to take advantage of CPU stalls or downtime, giving sizeable speedups on newer microarchitectures.
Photoscan is a mixed bag of operations, going through single thread sections to multithread and a range of cache/memory bandwidth requirepements. There's not much difference between thw 10 core and the 8 core, but the frequency helps against Broadwell-E.
Corona is a standalone package designed to assist software like 3ds Max and Maya with photorealism via ray tracing. It's simple - shoot rays, get pixels. OK, it's more complicated than that, but the benchmark renders a fixed scene six times and offers results in terms of time and rays per second. The official benchmark tables list user submitted results in terms of time, however I feel rays per second is a better metric (in general, scores where higher is better seem to be easier to explain anyway). Corona likes to pile on the threads, so the results end up being very staggered based on thread count.
For a render that has been around for what seems like ages, Blender is still a highly popular tool. We managed to wrap up a standard workload into the February 5 nightly build of Blender and measure the time it takes to render the first frame of the scene. Being one of the bigger open source tools out there, it means both AMD and Intel work actively to help improve the codebase, for better or for worse on their own/each other's microarchitecture.
As a synthetic, LuxMark might come across as somewhat arbitrary as a renderer, given that it's mainly used to test GPUs, but it does offer both an OpenCL and a standard C++ mode. In this instance, aside from seeing the comparison in each coding mode for cores and IPC, we also get to see the difference in performance moving from a C++ based code-stack to an OpenCL one with a CPU as the main host.
Another regular benchmark in most suites, POV-Ray is another ray-tracer but has been around for many years. It just so happens that during the run up to AMD's Ryzen launch, the code base started to get active again with developers making changes to the code and pushing out updates. Our version and benchmarking started just before that was happening, but given time we will see where the POV-Ray code ends up and adjust in due course.
The latest version of CineBench has also become one of those 'used everywhere' benchmarks, particularly as an indicator of single thread performance. High IPC and high frequency gives performance in ST, whereas having good scaling and many cores is where the MT test wins out.
Sunspider goes after peak frequency most of the time, althoguh there is some variation as it moves into basically becoming a legacy test.
Mozilla Kraken 1.1
Kraken is more of an intense attack on JS, and still regularly sorts by IPC and frequency.
Google Octane 2.0
Along with Mozilla, as Google is a major browser developer, having peak JS performance is typically a critical asset when comparing against the other OS developers. In the same way that SunSpider is a very early JS benchmark, and Kraken is a bit newer, Octane aims to be more relevant to real workloads, especially in power constrained devices such as smartphones and tablets.
Octane seems to be an optimization target, and with the new Skylake-X it shows.
While the previous three benchmarks do calculations in the background and represent a score, WebXPRT is designed to be a better interpretation of visual workloads that a professional user might have, such as browser based applications, graphing, image editing, sort/analysis, scientific analysis and financial tools.
HandBrake H264 and HEVC
As mentioned above, video transcoding (both encode and decode) is a hot topic in performance metrics as more and more content is being created. First consideration is the standard in which the video is encoded, which can be lossless or lossy, trade performance for file-size, trade quality for file-size, or all of the above can increase encoding rates to help accelerate decoding rates. Alongside Google's favorite codec, VP9, there are two others that are taking hold: H264, the older codec, is practically everywhere and is designed to be optimized for 1080p video, and HEVC (or H265) that is aimed to provide the same quality as H264 but at a lower file-size (or better quality for the same size). HEVC is important as 4K is streamed over the air, meaning less bits need to be transferred for the same quality content.
Handbrake is a favored tool for transcoding, and so our test regime takes care of three areas.
Low Quality/Resolution H264: He we transcode a 640x266 H264 rip of a 2 hour film, and change the encoding from Main profile to High profile, using the very-fast preset.
More cores, more frequency, more IPC, more fun: the Core i9-7900X wins here, and even the i7-7800X wins out against the Core i7-6900K.
High Quality/Resolution H264: A similar test, but this time we take a ten-minute double 4K (3840x4320) file running at 60 Hz and transcode from Main to High, using the very-fast preset.
Moving into HQ mode means making the job more parallel, so the higher core counts stay at the top of the chart.
HEVC Test: Using the same video in HQ, we change the resolution and codec of the original video from 4K60 in H264 into 4K60 HEVC.
For the 2017 test suite, we move to the latest version of WinRAR in our compression test. WinRAR in some quarters is more user friendly that 7-Zip, hence its inclusion. Rather than use a benchmark mode as we did with 7-Zip, here we take a set of files representative of a generic stack (33 video files in 1.37 GB, 2834 smaller website files in 370 folders in 150 MB) of compressible and incompressible formats. The results shown are the time taken to encode the file. Due to DRAM caching, we run the test 10 times and take the average of the last five runs when the benchmark is in a steady state.
WinRAR loves having access to all the caches as much as possible, to prefetch and store data as needed. The Skylake-X chips fall back a bit here, even with DDR4-2666 support. The Core i7-7800X uses DDR4-2400 memory, so puts it further behind. Interesting didn't realise that the lower core count Broadwell-E chips were affected so much by this test, and the higher core count Ivy Bridge-E parts are faster here.
Algorithms using AES coding have spread far and wide as a ubiquitous tool for encryption. Again, this is another CPU limited test, and modern CPUs have special AES pathways to accelerate their performance. We often see scaling in both frequency and cores with this benchmark. We use the latest version of TrueCrypt and run its benchmark mode over 1GB of in-DRAM data. Results shown are the GB/s average of encryption and decryption.
One of the freeware compression tools that offers good scaling performance between processors is 7-Zip. It runs under an open-source licence, is fast, and easy to use tool for power users. We run the benchmark mode via the command line for four loops and take the output score.
Chromium Compile (v56)
Our new compilation test uses Windows 10 Pro, VS Community 2015.3 with the Win10 SDK to combile a nightly build of Chromium. We've fixed the test for a build in late March 2017, and we run a fresh full compile in our test. Compilation is the typical example given of a variable threaded workload - some of the compile and linking is linear, whereas other parts are multithreaded.
Despite originally coming out in 2008/2009, Futuremark has maintained PCMark8 to remain relevant in 2017. On the scale of complicated tasks, PCMark focuses more on the low-to-mid range of professional workloads, making it a good indicator for what people consider 'office' work. We run the benchmark from the commandline in 'conventional' mode, meaning C++ over OpenCL, to remove the graphics card from the equation and focus purely on the CPU. PCMark8 offers Home, Work and Creative workloads, with some software tests shared and others unique to each benchmark set.
SYSmark 2014 SE
SYSmark is developed by Bapco, a consortium of industry CPU companies. The goal of SYSmark is to take stripped down versions of popular software, such as Photoshop and Onenote, and measure how long it takes to process certain tasks within that software. The end result is a score for each of the three segments (Office, Media, Data) as well as an overall score. Here a reference system (Core i3-6100, 4GB DDR3, 256GB SSD, Integrated HD 530 graphics) is used to provide a baseline score of 1000 in each test.
A note on context for these numbers. AMD left Bapco in the last two years, due to differences of opinion on how the benchmarking suites were chosen and AMD believed the tests are angled towards Intel processors and had optimizations to show bigger differences than what AMD felt was present. The following benchmarks are provided as data, but the conflict of opinion between the two companies on the validity of the benchmark is provided as context for the following numbers.
Benchmarking Performance: CPU Legacy Tests
Our legacy tests represent benchmarks that were once at the height of their time. Some of these are industry standard synthetics, and we have data going back over 10 years. All of the data here has been rerun on Windows 10, and we plan to go back several generations of components to see how performance has evolved.
Parts and Performance
The three Skylake-X cores launched today are the Core i9-7900X, the Core i7-7820X, and the Core i7-7800X: 10, 8 and 6 core parts respectively using the updated Skylake-SP core, the new cache topology, and the new mesh. With some of the tests benefitting from the new features and others taking a backseat, we had a wide range of results. The most poignant of which should be when we pit this generation 10-core over last generations 10-core. The Core i9-7900X has a frequency advantage, an IPC advantage, and a significant price advantage, which should make for an easy steamrolling.
In the end, this is what we get: aside from some tests that are L3 memory sensitive such as DigiCortex, WinRAR, and some of the PCMark8 tests, the Core i9-7900X wins every CPU test. For anyone who was unsure about getting the 10-core on the last generation on a compute basis, this new one seems to be the one to get.
최신 스카이레이크-X 메인보드와 바이오스에 문제가 있기 때문에 성능은 추후 더 향상될 것.