1. Version 4 of PhotoRaw and PhotoRaw Lite are now available on the App Store. The new version is a full 64-bit rewrite of PhotoRaw, and takes full advantage of the speed of Apple's new devices. I've talked in previous posts about the speed advantages that 64-bit operation can bring.

    If you haven't tried PhotoRaw since the first version of PhotoRaw and the iPad 1, you should try the new version on an iPad Air - you'll be surprised......

  2. In my last blog post, I went all technical, and talked about how to use the SIMD hardware acceleration, otherwise known as NEON, on Apple's new 64-bit processor (aka ARM64, aka ARMv8-A) on the iPad Air and iPhone 5s.

    But the question is, is it actually worthwhile? Writing code for a SIMD processor is hard at the best of times, and in this case the documentation is near non-existent, and Apple's compiler is buggy. (It turns out that Apple use their own undocumented instruction naming convention which is aliased to the ARM names. But sometimes, the alias isn't quite right.)

    Now this is interesting because the new processor actually has two distinct personalities - it can either be a 32-bit processor (aka ARMv7), which looks just the same as previous generation iPad/iPhone processors, or it can be a 64-bit processor with a different instruction set (ARMv8-A). By way of background, there have been a number of "experts" on the web stating that 64-bit would make no difference. Which is of course in theory true - all other things being equal, you can build a 32-bit processor as fast as a 64-bit one, but that rather misses the point. The point being, did Apple decide to make all other things equal, or not? Given X amount of chip area to work with, Apple could choose to use that area either to make the 32-bit part of the chip fast, or the 64-bit part.

    So I set out to find out (a) just how fast the new iPad is in imaging applications, and (b) whether either 64-bit mode or using the SIMD instruction set would make a significant difference.

    The benchmark

    My interest in this is practical, and is just about how to optimize my products, in this case PhotoRaw. So I chose to measure the performance of just one stage of PhotoRaw's pipeline which happens to be fairly "SIMD friendly", and is already SIMD accelerated for 32-bit under the older ARM processors. Note:

    • This is just a single point test - the stage in question is typical of an image processing pipeline, but your results may vary. A lot. Also, it's real production code, and it's the whole stage, so when I say SIMD, that actually means a mix of SIMD and C++.
    • The stage is multi-threaded, so will use all cores. Specifically, note that the iPad 1 is single core vs the later iPads' two-core architecture.
    • The NEON SIMD code is hand optimized. Interestingly, the SIMD code in the core loop on the 64-bit ARMv8-A is 23 instructions vs. 27 for the 32-bit code, so about a 15% saving there, although that's not hugely meaningful as different instructions take different numbers of cycles to complete.
    • Finally, it so happens that this stage runs identically in AccuRaw, which allows me to also benchmark the same code, in X86 form, on an Intel Core i7 processor, which is quad core.

    The results

    Times in ms, lower is better.

    Device                   C++     SIMD
    iPad 1                 6,056    2,813
    iPad 4                   581      514
    iPad Air 32-bit          321      474
    iPad Air 64-bit          230      108
    Intel Core i7 4.2 GHz     46


    The results are interesting, and probably not quite what you'd expect:
    1. Unsurprisingly, the iPad 1 just gets completely outclassed - it has a slow single core processor, and just can't keep up at all. On the iPad 1 however, SIMD makes a real difference, which is how SIMD originally found its way into PhotoRaw.
    2. The iPad 4 is much better, but there's a surprise - SIMD code only helps a little.
    3. On the iPad Air, there's another surprise - running in 64-bit mode instantly gains you about 40% - 230 ms vs 321 ms just using compiled C++ code.
    4. SIMD on the iPad Air is the real shocker. Firstly, in 32-bit mode, it's slower than straight C++ code. If I had to guess, I'd say that Apple deliberately built the 32-bit SIMD side of the new chip to just match the iPad 4, for compatibility reasons. However, in 64-bit mode, it's screamingly fast, clocking a full 5 times faster than the iPad 4, and twice as fast as compiled C++.
    5. Apple have claimed that the processor in the iPad Air is "desktop class". Well, sort of. Versus a Core i7 clocking at 4.2 GHz, it's about 1/5 the speed. But on a per core basis, that's close to half the speed. That from a device that, including memory, screen, battery, etc, takes up about 10% of the space that the Core i7's heat sink and fan take up!!!!!
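The ratios quoted in the list above follow directly from the results table. As a quick check, here is a minimal sketch in C. The helper names `speedup` and `print_speedups` are made up for illustration, and the timings are simply the figures from the table; the last line assumes the Core i7 figure is being compared against the iPad Air's 64-bit C++ time, which is the pairing that matches the "about 1/5" remark.

```c
#include <stdio.h>

/* How many times faster a run taking after_ms is than one taking before_ms. */
static double speedup(double before_ms, double after_ms)
{
    return before_ms / after_ms;
}

/* Print the comparisons discussed in the results list (timings from the table). */
static void print_speedups(void)
{
    printf("iPad Air C++, 64-bit vs 32-bit:   %.2fx\n", speedup(321.0, 230.0));
    printf("iPad Air 64-bit, SIMD vs C++:     %.2fx\n", speedup(230.0, 108.0));
    printf("SIMD, iPad Air 64-bit vs iPad 4:  %.2fx\n", speedup(514.0, 108.0));
    printf("Core i7 vs iPad Air 64-bit C++:   %.2fx\n", speedup(230.0, 46.0));
}
```

Calling `print_speedups()` gives roughly 1.4x, 2.1x, 4.8x and 5x respectively.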

    Conclusions

    First conclusion - if you were wondering whether the whole bother of rebuilding apps for 64-bit is worthwhile, then the answer is that if they are CPU intensive imaging apps, then it is probably worth the bother. On the numbers above, you can expect roughly a 40% uptick in performance right there.

    Second conclusion - SIMD might be worthwhile for you, but only if you're going to 64-bit mode and have a real need. Otherwise, don't bother.

    Third conclusion - all those web "experts" that said that 64-bit doesn't matter - well, Apple made it matter.

    Finally, various people have speculated as to whether Apple's 64-bit chip could find its way into a desktop product. The answer is, yes, probably. If you built a 4-core version, up-clocked it and added heat sinking, it probably still wouldn't quite compete with the top-of-the-line Intel chips. But it would be quite capable.

  3. So this one is for the serious techies.

    It's been widely publicized that the new iPad Air and iPhone 5s have 64-bit processors. What's not so well understood is that the 64-bit processors can actually either run old 32-bit code, or new 64-bit code. If you run the new 64-bit code, you're running an instruction set that's quite different to the old one; it's not like it just got bigger registers. Apple just refer to the new architecture as ARM64, but in official ARM speak, the new instruction set is actually ARMv8-A; the 32-bit set was ARMv7.

    Now a lot of apps used the NEON SIMD extensions to improve performance - SIMD instructions effectively perform the same instruction simultaneously on multiple pieces of data, so speeding up operations.
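To make the SIMD idea concrete, here is a minimal sketch in C. It deliberately uses the GCC/Clang `vector_size` extension rather than NEON itself, so it builds on any target those compilers support; the names `float4`, `add4_scalar` and `add4_simd` are made up for illustration.

```c
#include <string.h>

/* Illustration only: one operation applied to four floats at once.
   Uses the GCC/Clang vector extension, not NEON-specific code. */
typedef float float4 __attribute__((vector_size(16)));

/* Scalar version: four separate additions. */
static void add4_scalar(const float a[4], const float b[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
}

/* SIMD version: a single vector addition covers all four lanes. */
static void add4_simd(const float a[4], const float b[4], float out[4])
{
    float4 va, vb, vr;
    memcpy(&va, a, sizeof va);    /* load four floats */
    memcpy(&vb, b, sizeof vb);
    vr = va + vb;                 /* one add, four results */
    memcpy(out, &vr, sizeof vr);  /* store four floats */
}
```

Both functions produce the same four sums; the difference is that the second expresses the work as one vector operation, which the compiler can map to a single SIMD instruction where the hardware has one.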

    How to write NEON code on the old 32-bit iDevices was well understood; there were libraries such as Math-Neon available that you could look at and/or use directly. Not so for the 64-bit devices; there is not much out there other than ARM's deep-dive engineering manuals. The shortest ARM document I found is a 112 page "Instruction Set Overview", which is not exactly easy reading, and of course is not Apple specific. In fact, ARM don't seem to even call the SIMD extensions NEON anymore, it's just "Advanced SIMD Floating-Point". The only third-party exception that I found is the folks at Linaro, who have some presentations about porting to ARMv8, and have also ported the libjpeg-turbo library to ARMv8.

    However, the Linaro code isn't Apple friendly, and doesn't show how to do ASM blocks, how to select between 32-bit and 64-bit mode, etc.

    Better yet, Apple refuse to help. I asked their Apple Developer Technical Support (DTS) service for suggestions on best practice with a paid tech support incident. What I got back was: "Thank you for contacting Apple Developer Technical Support (DTS). DTS does not provide support for ARM assembly." So not much support there.

    However, not one to let an absence of documentation or support stop me, below is a simple technology demonstrator of how it's possible to support NEON-32 and NEON-64 code in the form of inline asm blocks in an Xcode based project.

    The example below is quite simple, just a 3x3 matrix by 3 element vector multiply routine, but it shows a structure that works. The 64-bit code is just a straight port of the 32-bit code; it may be possible to further optimize it, but this is just intended to show the principle.


    #if !defined(__i386__) && defined(__ARM_NEON__)
    #if  (!defined(__LP64__) && !defined(_LP64))
    #define __MATH_NEON_32
    #else
    #define __MATH_NEON_64
    #endif
    #endif

    // Plain C reference version: d = matrix * v, with matrix stored row-major
    void
    matvec3_RowMajor(float matrix[3][3], float v[3], float d[3])
    {
      float *m = (float *) matrix;
      d[0] = m[0]*v[0] + m[1]*v[1] + m[2]*v[2];
      d[1] = m[3]*v[0] + m[4]*v[1] + m[5]*v[2];
      d[2] = m[6]*v[0] + m[7]*v[1] + m[8]*v[2];
    }

    void __attribute__((noinline)) matvec3_neon_RowMajor(float m[3][3], float v[3], float d[3])
    {
    #if defined(__MATH_NEON_32)
      __asm__ volatile (
        "vld1.32 {d6}, [%1]! \n\t"         // Q3 = v
        "vld1.32 d7[0], [%1] \n\t"
     
        "vld3.32 {d0, d2, d4}, [%0]! \n\t" // Q0 = {x, m0_6, m0_3, m0_0} = {d1, d0}
        "vld3.32 {d1[0], d3[0], d5[0]}, [%0] \n\t" // Q1 = {x, m0_7, m0_4, m0_1} = {d3, d2}
                                                                            // Q2 = {x, m0_8, m0_5, m0_2} = {d5, d4}
     
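        "vmul.f32 q9, q0, d6[0] \n\t" // Multiply out: Q9  = Q0 * v[0]
        "vmla.f32 q9, q1, d6[1] \n\t" //               Q9 += Q1 * v[1]
        "vmla.f32 q9, q2, d7[0] \n\t" //               Q9 += Q2 * v[2]
        "vmov.f32 q0, q9 \n\t" // Copy result to Q0 so that d0 and s2 address it
     
        "vst1.32 d0, [%2]! \n\t" // Store d[0], d[1] (low half of Q0)
        "fsts s2, [%2] \n\t" // Store d[2] (lane 2 of Q0)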
     
        : "+r"(m), "+r"(v), "+r"(d)
        :
        : "q0", "q1", "q2", "q3", "q9", "memory"
      );
    #elif defined(__MATH_NEON_64)
      __asm__ volatile (
        "ld1 {v3.2s}, [%1], 8 \n\t" // V3 = v
        "ld1 {v3.s}[2], [%1] \n\t"

        "ld3 {v0.2s, v1.2s, v2.2s}, [%0], 24 \n\t" // V0 = {x, m0_6, m0_3, m0_0}
        "ld3 {v0.s, v1.s, v2.s}[2], [%0] \n\t" // V1 = {x, m0_7, m0_4, m0_1}
                                                                            // V2 = {x, m0_8, m0_5, m0_2}
                          
        "fmul v9.4s, v0.4s, v3.s[0] \n\t" // Multiply out
        "fmla v9.4s, v1.4s, v3.s[1] \n\t" //
        "fmla v9.4s, v2.4s, v3.s[2] \n\t" //
                          
        "st1 {v9.2s}, [%2], 8 \n\t" // Result in V9
        "st1 {v9.s}[2], [%2] \n\t"
     
        : "+r"(m), "+r"(v), "+r"(d)
        :
        : "v0", "v1", "v2", "v3", "v9", "memory"
      );
    #else
      matvec3_RowMajor(m, v, d);
    #endif
    }
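For comparison, the same row-major multiply can also be sketched with the NEON intrinsics from `<arm_neon.h>` rather than inline asm, which lets one source path serve both 32-bit and 64-bit builds. This is a swapped-in technique, not the approach used in the listing above, and the function name `matvec3_intrinsics_RowMajor` is made up for illustration. It falls back to plain C when NEON is not available, and its results may differ from the asm version in the last bit, since the compiler decides whether the multiply-accumulate is fused.

```c
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* d = M * v for a row-major 3x3 matrix passed as 9 consecutive floats.
   Hypothetical intrinsics-based variant of the inline asm routine. */
static void matvec3_intrinsics_RowMajor(const float m[9], const float v[3], float d[3])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* De-interleave rows 0 and 1: lo.val[k] = { m[k], m[3 + k] } */
    float32x2x3_t lo = vld3_f32(m);

    /* Row 2 goes into lane 0 of three otherwise-zero vectors */
    float32x2x3_t hi;
    hi.val[0] = vdup_n_f32(0.0f);
    hi.val[1] = vdup_n_f32(0.0f);
    hi.val[2] = vdup_n_f32(0.0f);
    hi = vld3_lane_f32(m + 6, hi, 0);

    /* Columns of the matrix, one per register (lane 3 is unused padding) */
    float32x4_t c0 = vcombine_f32(lo.val[0], hi.val[0]); /* { m0, m3, m6, 0 } */
    float32x4_t c1 = vcombine_f32(lo.val[1], hi.val[1]); /* { m1, m4, m7, 0 } */
    float32x4_t c2 = vcombine_f32(lo.val[2], hi.val[2]); /* { m2, m5, m8, 0 } */

    /* r = c0 * v[0] + c1 * v[1] + c2 * v[2] */
    float32x4_t r = vmulq_n_f32(c0, v[0]);
    r = vmlaq_n_f32(r, c1, v[1]);
    r = vmlaq_n_f32(r, c2, v[2]);

    /* Store only the three meaningful lanes */
    vst1_f32(d, vget_low_f32(r));
    d[2] = vgetq_lane_f32(r, 2);
#else
    /* Plain C fallback for targets without NEON */
    d[0] = m[0]*v[0] + m[1]*v[1] + m[2]*v[2];
    d[1] = m[3]*v[0] + m[4]*v[1] + m[5]*v[2];
    d[2] = m[6]*v[0] + m[7]*v[1] + m[8]*v[2];
#endif
}
```

The trade-off is the usual one: intrinsics leave register allocation and scheduling to the compiler, so they are easier to maintain across both instruction sets, while hand-written asm keeps full control over the instruction sequence.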

  4. Lloyd Chambers, of the diglloyd blog, recently published a review of the Nikon D800M in which he used AccuRaw Monochrome for raw processing. So what is AccuRaw Monochrome and why should you care? Here's the story:

    By way of background, the D800M is a Nikon D800 that's been modified by the folks at MaxMax.com to remove the Bayer color filter layer on the sensor, creating a pure monochrome camera. So, no Bayer demosaicing artifacts, beautiful tonality, etc, etc. Now you could just go out and buy a pure monochrome camera off the shelf, in the form of Leica's M Monochrom, which I spoke about on this blog in some previous posts. Problem is, once you've bought an M Monochrom, and a few lenses, you won't have much change from say $20,000. And then, much as I love Leica's M cameras, you have a camera that is really only at its best for lenses between 35mm and 75mm, doesn't have auto-focus, etc, etc. Enter the folks at MaxMax. A lot of their work is for scientific and engineering applications, but they will build you a camera modified to pure monochrome. You can choose anything from a pocketable point-and-shoot to a top-of-the-line Nikon. You can also get cameras without IR filters, UV filters, etc, etc to suit your intended use.

    Lloyd's various posts go into detail about image quality, usability, etc, and are well worth the read if you're interested in monochrome work, so I won't go into any of that here. What I will talk about is the technicalities of processing the image from a camera modified to monochrome.

    The problem of course is, how do you do the raw processing? Well, if you're handy with command line options and don't mind a fairly complex multi-step process, you can persuade DCRaw to treat the image as a single color. Or you can just use a conventional raw processor such as ACR or Lightroom (or the "normal" version of AccuRaw, for that matter). Of course, the raw processor is going to think that it's still dealing with a Bayer matrix camera, and as a result is going to try to demosaic monochrome data. While the end result of that is actually not as bad as you might imagine, it isn't ideal (see later for an example).

    Enter AccuRaw Monochrome. Now, to be clear, "AccuRaw" and "AccuRaw Monochrome" are separate products. AccuRaw Monochrome is dedicated to monochrome applications. Its primary target is actually conventional unmodified off-the-shelf cameras. For those cameras, AccuRaw Monochrome has a special demosaicing algorithm dedicated to creating monochrome images. Because AccuRaw Monochrome is dedicated just to that, it can do a better job than conventional demosaicing algorithms that are optimized for good color results.

    However, AccuRaw Monochrome also has another trick up its sleeve - it can also do true monochrome processing for a camera modified to monochrome operation such as by MaxMax. It's in this role that Lloyd Chambers was using the AccuRaw Monochrome beta. (If you'd like to be a beta tester, either with a conventional or modified camera, drop me an email to the contact address on the AccuRaw website.)

    True raw processing

    To demonstrate the difference that true monochrome processing makes, I'll use an image provided by the folks at MaxMax.com that was created with a D800M. Firstly, let's look at the entire image, as processed with Lightroom 5.2 and AccuRaw Monochrome Beta 0.9.1. Note that I'm using Lightroom here just because that is what is commonly used - you would get similar results from any conventional raw processor, including the normal version of AccuRaw for example.

    Lightroom 5.2, default settings, saturation set to zero


    AccuRaw Monochrome Beta 0.9.1, default settings

    Looking at the entire image, reduced to a size that fits well on this page, there is no discernible difference between LR and AccuRaw Monochrome. The images might as well be identical - exposure is the same, overall contrast is the same, etc. However, let's take a closer look. But before we do that, hold onto the "overall contrast is the same" thought - it will be important later.

    100% crops

    Now let's look at some 100% crops:

     Lightroom 5.2, 100% crop, default settings, saturation = 1


    Lightroom 5.2, 100% crop, default settings, saturation = 0



    AccuRaw Monochrome beta, 100% crop, default settings

    The first Lightroom crop, with saturation set to 1, clearly shows the demosaicing problem in the form of color artifacts, e.g., the "On" lettering and the "System" lettering. Setting the saturation to 0, in the second crop, sorts that (and the white balance) out. However, the artifacts are still there - they're just not as obvious. In the third crop, this one from AccuRaw Monochrome, those artifacts aren't there, and the image is noticeably sharper all over.

    400% crops

    Looking at some crops at 400% will make what is happening more obvious:

    Lightroom 5.2, 400% crop, default settings, saturation = 0 


    AccuRaw Monochrome beta, 400% crop, default settings

    Comparing the two 400% crops, what you see is really two things - firstly, the demosaicing process is creating artifacts - you can most easily see that around the lettering. But secondly, and more subtly, there is also a loss of micro-contrast in the LR image relative to the AccuRaw Monochrome image. E.g., take a look at the "2000i" text. And remember, the overall contrast is the same. That loss of micro-contrast makes a real difference, and is primarily why AccuRaw Monochrome's 100% crop looks much sharper all over, not just where there are artifacts.

    Now I just need to work on persuading myself that a MaxMax modified Nikon Df is absolutely necessary to my artistic development.......might not be hard.....

  5. For those that have been looking for an opportunity to buy AccuRaw, it's 30% off on the App Store for the whole Black Friday weekend, through Cyber Monday.

    AccuRaw on the App Store

  6. Sean Reid, of Reid Reviews, just published a follow-up article to his test of the Fujifilm XF27/2.8 lens, titled "XF Lens Res. Revisited". In this article Sean repeats a number of his previous tests of Fuji and Zeiss lenses, but this time using AccuRaw. In addition, he also tests the effect of software correction of lens aberration on lens resolution. Unsurprisingly, software correction of lens aberration can result in noticeable reductions in resolution. Many modern cameras are sold with lenses that have very significant levels of aberration that is automatically corrected by the raw developer. But "there's no free lunch".

    Resolution loss due to software correction is a topic that isn't often discussed, and so Sean's article is well worth the read. You can find it here. Note that access to Sean's site is by subscription only;  Sean doesn't take any advertising, allowing him to produce reviews without fear or favor.

  7. Sean Reid, of Reid Reviews, just published his extensive test of the Fujifilm XF27/2.8 lens. As usual with Sean's reviews, it very comprehensively covers the lens' performance as regards resolution, aberration, vignetting, etc.

    For this review however, Sean also took things a little further, and tested the effect of the raw converter used on apparent resolution. He tested Lightroom, AccuRaw, Raw Converter EX, Capture One and Iridient Developer. You might think that all of the converters were about the same. Not so. In fact Sean concluded that "The results above seem almost as if they could have come from five different lenses".

    And which converter came out on top? Sean goes on to say that "And, while its conversions may show some artifacts, it seems clear to me that the AccuRaw conversion does the best job of showing us the resolution levels this lens is capable of".

    Sean's review is well worth the read. You can find it here. Note that access to Sean's site is by subscription only;  Sean doesn't take any advertising, allowing him to produce reviews without fear or favor.

  8. Hans van Eijsden, a fashion and portrait photographer based in the Netherlands, just published a really great article describing how he goes about getting skin tones just right. Most important, he describes how he does so without spending hours manually editing each image. For a professional photographer, time is money, so an automated, repeatable method is really important. In brief, Hans' method is to use an X-Rite ColorChecker Passport in combination with some clever profile editing in Adobe's DNG Profile editor, together with some knowledge gleaned from using dcpTool. He also describes in some detail why, although the "hue twists" I've described in previous posts (e.g., here and here) might be great for amateur photographers that just want an image that looks good as soon as they open it, hue twists aren't always great for pros.

    If you have an interest in how a fashion photographer gets skin tones perfect in a real commercial setting, this is a must-read. While you're on Hans' site, you should also take the time to check out some of his work - there are some really great images there.

  9. Several people have asked me whether there will be a new version of PhotoRaw for iOS 7, and if so, will existing users have to pay. For those that don't know, PhotoRaw is my raw image viewer/editor for iPhones and iPads.

    The good news for users is that:
    • Yes, there is a brand new version of PhotoRaw for iOS 7, complete with a cool new iOS 7 look and feel. Even better, it's available right now on the App Store.
    • And......it's a free upgrade for all existing users. I know a lot of app developers have taken the opportunity to build entire new versions of their apps, so forcing existing users to pay again. But the new version of PhotoRaw is a free upgrade to anyone that's ever purchased PhotoRaw, right back to day one.
    Comparative screenshots of PhotoRaw 3.7.1 are below, running on iOS 6 and iOS 7 respectively. Note that iOS 6 users don't get the new look and feel - you have to upgrade to iOS 7 for it to show up.

    Enjoy!

  10. Yesterday the web lit up with variations on "Adobe to bring Lightroom-style photo editing to tablets". Originally from Scott Kelby's "The Grid", and then picked up by CNET and multiple others. The big deal being the ability to process raw images on the iPad.

    Guess what guys? - it may be news that Adobe is doing this, but the technology isn't new at all. Actually, Adobe's 18 months to two years behind the curve. There are at least two apps on the App Store ready for purchase that have been doing most of what Adobe demo'd. They've been available for years now. PhotoRaw, my app, being the most popular of them.

    In fact, Adobe may be more than two years behind the curve here. PhotoRaw works on a first generation iPad while, so far as I can tell, the Adobe demo needs a second generation iPad (corrected from "the latest fourth generation" - see comment below). Very different.

    Maybe a bit of research rather than just parroting the press release.......just saying......

About Me
Author of AccuRaw, PhotoRaw, CornerFix, pcdMagic, pcdtojpeg, dcpTool, WinDat Opener and occasional photographer....