Sunday, July 20, 2014

PhotoRaw 4.0.7 - Nikon D810, Sony A7S, A77 M2, etc support

PhotoRaw 4.0.7 for the iPhone and iPad is now available on the Apple App Store. Version 4.0.7 adds raw image support for new cameras including the Nikon D810, Nikon 1 J4, Nikon 1 S2, Panasonic DMC-FZ1000, Sony Alpha 77 II and Sony Alpha A7S.

Enjoy!

Friday, April 25, 2014

Leica T (Typ 701) raw file (DNG) analysis

I just took a quick look inside a DNG from one of Leica's new Leica T cameras:
  1. The camera still appears to be using beta software - no firmware version number shows, etc.
  2. The camera name shows as "Leica T (Typ 701)"
  3. The image data is 12-bit. There is no compression used in the DNG I looked at. Somewhat unusually, the data appears to be packed, four 12-bit values in 6 bytes, rather than the more typical one 12-bit value in a 16-bit location. This is allowed by the DNG spec, but isn't often used. This means that the file size is approximately 24.5 MB vs. what would otherwise be approximately 33.6 MB.
  4. The DNG version is 1.3, a higher revision than the 1.1 that most previous Leica cameras have used. There is a reason for this - DNG 1.3 allows for opcodes, which Leica use for lens correction.
  5. In the DNG I looked at, which was shot with an 18-56 Vario-Elmar lens, lens correction is done by a single "WarpRectilinear" operation in the DNG. Other lenses (or the same lens at a different focal length) might use other codes.
  6. There is a single Leica makernote.
Generally, the DNG seems to be quite standard - it happily works "out of the box" with current versions of all of my software: PhotoRaw, AccuRaw, AccuRaw Monochrome and Cornerfix.
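For anyone who wants to read the packed layout described in point 3, here is a minimal sketch of unpacking two 12-bit samples from every three bytes. It assumes MSB-first bit order within the byte stream, which is my understanding of what the DNG spec requires for sample sizes other than 8, 16 or 32 bits, and it ignores any per-row padding. The function names are made up for illustration and are not from any Leica or Adobe code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to store `count` 12-bit samples back to back.
   `count` is assumed even, so the data ends on a byte boundary. */
size_t packed12_size(size_t count)
{
    assert(count % 2 == 0);
    return count / 2 * 3;
}

/* Unpack `count` 12-bit samples from `src` into 16-bit values in `dst`.
   Assumed layout (MSB first), for each pair of samples A and B:
     byte 0 = top 8 bits of A
     byte 1 = low 4 bits of A, then top 4 bits of B
     byte 2 = low 8 bits of B */
void unpack12(const uint8_t *src, uint16_t *dst, size_t count)
{
    assert(count % 2 == 0);
    for (size_t i = 0; i < count; i += 2) {
        const uint8_t b0 = src[0], b1 = src[1], b2 = src[2];
        dst[i]     = (uint16_t)((b0 << 4) | (b1 >> 4));
        dst[i + 1] = (uint16_t)(((b1 & 0x0F) << 8) | b2);
        src += 3;
    }
}
```

The size saving follows directly: packed12_size(n) is 1.5 bytes per sample, against 2 bytes per sample when each 12-bit value sits in its own 16-bit location.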

The only slightly interesting thing I noticed is the color rendering in the DNG. When I looked at Sean Reid's images in his review of the camera, I commented to him that the colors looked oversaturated, especially the reds. Interestingly, looking at a DNG alongside a JPEG preview, the JPEG preview is much more subdued than a DNG rendering using the Leica embedded color profile, especially in the reds. However, this is probably nothing that couldn't be fixed with a custom camera profile.

Tuesday, April 15, 2014

Importing raw images into Lightroom Mobile

A lot of people have found out that, contrary to what they might have assumed, you can't import raw files into Lightroom Mobile on the iPad. Only JPEGs are supported. To import raws, you need to go via a desktop machine. Which is not much use if you're in the field with only an iPad.

Fortunately, there is an easy workaround with PhotoRaw - here are the step-by-step instructions:

  1. Install PhotoRaw on your iPad. Note that PhotoRaw Lite won't work for this.
  2. Import the raw files stored on your iPad into PhotoRaw.
  3. Batch export them - touch the batch button (the gears), select all the files, then touch the action button (the arrow), select JPEG quality, and touch save.
  4. Your images are now saved on the iPad as JPEGs at full resolution, and can be imported into Lightroom Mobile just as you would any other JPEG.
Easy!

Tuesday, April 8, 2014

Adobe Lightroom Mobile and Lossy DNG

Adobe's Lightroom Mobile is out. There are numbers of "first looks" and reviews on various sites, e.g. at MacWorld. I won't go into what Lightroom Mobile is and is not - the mainstream sites are doing that already - but there is one thing that the mainstream haven't picked up on. That thing is that Adobe's Lossy DNG format finally starts to make some kind of sense. Well, sort of.

In a post back in January 2012, when Lossy DNG was introduced, I discussed the new format in not very complimentary terms. In fact, as a replacement for raw formats, I called it an "engineering abomination". But I also noted that as a replacement for JPEG rather than as a replacement for a raw format, it had some useful features. But the format remained a bit of a puzzle - for archival purposes, it was a dog, as Adobe themselves acknowledge, but as a replacement for JPEG, the question was who was going to adopt it? One obvious possible adopter would be the camera companies as an in-camera format, but the camera companies, at least the significant players, were always unlikely to adopt something with an Adobe label on it.

The answer to this puzzle appears to be that Adobe themselves had plans for it. It appears that Lightroom Mobile uses lossy DNG as its format on the iPad. So when Lightroom Mobile talks about "raw", what they actually mean is Adobe lossy DNG as created by one of the Adobe desktop products. Aka, what is really a JPEG format on steroids rather than a raw format. Which in some ways is actually quite clever. It's not at all clear that this was the plan all along, but if it wasn't good long term planning back then, it's sure good improvisation now.

But is this clever in the long term? In the short term, this is certainly going to be good for Adobe - at least for Adobe's share price, which of course is very good for all of the employees on stock option schemes. The market loves the cloud, and Lightroom Mobile is very obviously designed to drive cloud adoption; signing up for one of Adobe's cloud-based subscription schemes is the only way to get Lightroom Mobile. In fact, if I was in my Hermès-silk-tie-wearing-strategy-consultant role, I'd probably recommend this as a strategy, at least in the short term. But I'm not so sure that this is clever in the long term. It's already possible to run a full raw converter on an iPad. The early versions of my product, PhotoRaw, were frankly a novelty on the iPad 1; they were just too slow to be useful outside of niche situations. But a current version of PhotoRaw, which is way faster than the early versions even on an iPad 1, running on an iPad Air - the combination is a practical way of editing images in a lot of situations. In a few years, stand-alone raw developers on tablets will be mainstream. At that point, Lightroom Mobile may well look like a distraction rather than a smart idea.


Friday, December 27, 2013

The 64-bit version of PhotoRaw is out

Version 4 of PhotoRaw and PhotoRaw Lite are now available on the App Store. The new version is a full 64-bit rewrite of PhotoRaw, and takes full advantage of the speed of Apple's new devices. I've talked in previous posts about the speed advantages that 64-bit operation can bring.

If you haven't tried PhotoRaw since the first version of PhotoRaw and the iPad 1, you should try the new version on an iPad Air - you'll be surprised......

Tuesday, December 10, 2013

How fast is the iPad Air for image processing, and does NEON make sense?

In my last blog post, I went all technical, and talked about how to use the SIMD hardware acceleration, otherwise known as NEON, on Apple's new 64-bit processor (aka ARM64, aka ARMv8-A) on the iPad Air and iPhone 5s.

But the question is, is it actually worthwhile? Writing code for a SIMD processor is hard at the best of times, and in this case the documentation is near non-existent, and Apple's compiler is buggy. (It turns out that Apple use their own undocumented instruction naming convention which is aliased to the ARM names. But sometimes, the alias isn't quite right.)

Now this is interesting because the new processor actually has two distinct personalities - it can either be a 32-bit processor (aka ARMv7), which looks just the same as previous generation iPad/iPhone processors, or it can be a 64-bit processor with a different instruction set (ARMv8-A). By way of background, there have been numbers of "experts" on the web stating that 64-bit would make no difference. Which is of course in theory true - all other things being equal, you can build a 32-bit processor as fast as a 64-bit one, but that rather misses the point. The point being, did Apple decide to make all other things equal, or not? Given X amount of chip area to work with, Apple could choose to use that area either to make the 32-bit part of the chip fast, or the 64-bit part.

So I set out to find out (a) just how fast the new iPad is in imaging applications, and (b) whether either 64-bit mode or using the SIMD instruction set would make a significant difference.

The benchmark

My interest in this is practical, and is just about how to optimize my products, in this case PhotoRaw. So I chose to measure the performance of just one stage of PhotoRaw's pipeline which happens to be fairly "SIMD friendly", and is already SIMD accelerated for 32-bit under the older ARM processors. Note:

  • This is just a single point test - the stage in question is typical of an image processing pipeline, but your results may vary. A lot. Also, it's real production code, and it's the whole stage, so when I say SIMD, that actually means a mix of SIMD and C++.
  • The stage is multi-threaded, so will use all cores. Specifically, note that the iPad 1 is single core vs the later iPads' two-core architecture.
  • The NEON SIMD code is hand optimized. Interestingly, the SIMD code in the core loop on the 64-bit ARMv8-A is 23 instructions vs. 27 for the 32-bit code, so about a 15% saving there, although that's not hugely meaningful as different instructions take different numbers of cycles to complete.
  • Finally, it so happens that this stage runs identically in AccuRaw, so allows me to also benchmark the same code, in X86 form, on an Intel Core i7 processor, which is quad core.
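The post doesn't show the timing harness itself, so the following is only a sketch of how a single stage might be timed in milliseconds. The stage here (stage_gain) is a hypothetical stand-in, not PhotoRaw code, and it is single-threaded for brevity. Wall-clock time from a monotonic clock is used because CPU time from clock() would add up the time of all threads once a stage is multi-threaded.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one pipeline stage: apply a gain to a buffer. */
void stage_gain(float *pixels, size_t n, float gain)
{
    for (size_t i = 0; i < n; ++i)
        pixels[i] *= gain;
}

/* Wall-clock time in milliseconds from a monotonic clock. */
double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Run the stage `runs` times and return the fastest run in ms.
   Taking the minimum reduces noise from other activity on the device. */
double time_stage_ms(float *pixels, size_t n, float gain, int runs)
{
    assert(runs > 0);
    double best = -1.0;
    for (int r = 0; r < runs; ++r) {
        const double t0 = now_ms();
        stage_gain(pixels, n, gain);
        const double dt = now_ms() - t0;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}
```

Swapping stage_gain for a C++ or SIMD variant of the same stage, and keeping the rest of the harness fixed, gives figures that can be compared like for like.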

The results

Times in ms, lower is better.

                           C++     SIMD
iPad 1                   6,056    2,813
iPad 4                     581      514
iPad Air 32-bit            321      474
iPad Air 64-bit            230      108
Intel Core i7 4.2 GHz       46        -


The results are interesting, and probably not quite what you'd expect:
  1. Unsurprisingly, the iPad 1 just gets completely outclassed - it has a slow single core processor, and just can't keep up at all. On the iPad 1 however, SIMD makes a real difference, which is how SIMD originally found its way into PhotoRaw.
  2. The iPad 4 is much better, but there's a surprise - SIMD code only helps a little.
  3. On the iPad Air, there's another surprise - running in 64-bit mode instantly gains you about 40% - 230 ms vs 321 ms just using compiled C++ code.
  4. SIMD on the iPad Air is the real shocker. Firstly, in 32-bit mode, it's slower than straight C++ code. If I had to guess, I'd say that Apple deliberately built the 32-bit SIMD side of the new chip to just match the iPad 4, for compatibility reasons. However, in 64-bit mode, it's screamingly fast, clocking a full 5 times faster than the iPad 4, and twice as fast as compiled C++.
  5. Apple have claimed that the processor in the iPad Air is "desktop class". Well, sort of. Versus a Core i7 clocking at 4.2 GHz, it's about 1/5 the speed. But on a per core basis, that's close to half the speed. That from a device that including memory, screen, battery, etc takes up about 10% of the space that the Core i7's heat sink and fan take up!!!!!
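The ratios quoted in the list above can be checked from the table with a couple of one-line helpers. This is just arithmetic on the published times; the function names are made up for illustration, and the per-core comparison assumes the Core i7 figure is set against the iPad Air's 64-bit C++ time, which is the pairing that reproduces the "1/5" figure.

```c
#include <assert.h>

/* How many times faster the `fast_ms` run is than the `slow_ms` run. */
double speedup(double slow_ms, double fast_ms)
{
    assert(slow_ms > 0.0 && fast_ms > 0.0);
    return slow_ms / fast_ms;
}

/* Per-core speed of a device relative to a reference device.
   time * cores approximates the core-milliseconds each one spent,
   so the ratio of those products compares one core against one core. */
double relative_speed_per_core(double ms, int cores,
                               double ref_ms, int ref_cores)
{
    assert(ms > 0.0 && ref_ms > 0.0 && cores > 0 && ref_cores > 0);
    return (ref_ms * ref_cores) / (ms * cores);
}
```

With the table's numbers: speedup(321, 230) is about 1.40 (64-bit vs 32-bit C++ on the iPad Air), speedup(514, 108) is about 4.76 (iPad Air 64-bit SIMD vs iPad 4 SIMD), speedup(230, 108) is about 2.13 (SIMD vs C++ in 64-bit mode), and relative_speed_per_core(230, 2, 46, 4) is 0.4, i.e. a two-core iPad Air against a quad-core Core i7.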

Conclusions

First conclusion - if you were wondering whether the whole bother of rebuilding apps for 64-bit is worthwhile, then the answer is that if they are CPU intensive imaging apps, then it is probably worth the bother. On the numbers above, you can expect roughly a 40% uptick in performance right there.

Second conclusion - SIMD might be worthwhile for you, but only if you're going to 64-bit mode and have a real need. Otherwise, don't bother.

Third conclusion - all those web "experts" that said that 64-bit doesn't matter - well, Apple made it matter.

Finally, various people have speculated as to whether Apple's 64-bit chip could find its way into a desktop product. The answer is, yes, probably. If you built a 4-core version, up-clocked it and added heat sinking, it probably still wouldn't quite compete with the top-of-the-line Intel chips. But it would be quite capable.

Saturday, December 7, 2013

iPad Air, iPhone 5S and 64-bit NEON code

So this one is for the serious techies.

It's been widely publicized that the new iPad Air and iPhone 5s have 64-bit processors. What's not so well understood is that the 64-bit processors can actually either run old 32-bit code, or new 64-bit code. If you run the new 64-bit code, you're running an instruction set that's quite different to the old one; it's not like it just got bigger registers. Apple just refer to the new architecture as ARM64, but in official ARM speak, the new instruction set is actually ARMv8-A; the 32-bit set was ARMv7.

Now a lot of apps used the NEON SIMD extensions to improve performance - SIMD instructions effectively perform the same instruction simultaneously on multiple pieces of data, so speeding up operations.
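As a small illustration of "the same instruction on multiple pieces of data", here is a sketch that scales a buffer of floats. It uses the arm_neon.h intrinsics when the compiler reports NEON support and falls back to a plain loop elsewhere. The function name and the multiple-of-4 restriction are choices made for this example only.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* dst[i] = a[i] * gain for n floats; n is assumed to be a multiple of 4. */
void scale_f32(const float *a, float gain, float *dst, size_t n)
{
    assert(n % 4 == 0);
#if HAVE_NEON
    for (size_t i = 0; i < n; i += 4) {
        float32x4_t x = vld1q_f32(a + i);   /* load 4 floats at once      */
        x = vmulq_n_f32(x, gain);           /* 4 multiplies, 1 instruction */
        vst1q_f32(dst + i, x);              /* store 4 floats at once     */
    }
#else
    for (size_t i = 0; i < n; ++i)          /* scalar: 1 multiply per step */
        dst[i] = a[i] * gain;
#endif
}
```

The NEON path makes a quarter as many trips round the loop as the scalar path, which is where the potential speed-up comes from.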

How to write NEON code on the old 32-bit iDevices was well understood; there were libraries such as Math-Neon available that you could look at and/or use directly. Not so for the 64-bit devices; there is not much out there other than ARM's deep-dive engineering manuals. The shortest ARM document I found is a 112 page "Instruction Set Overview", which is not exactly easy reading, and of course is not Apple specific. In fact, ARM don't seem to even call the SIMD extensions NEON anymore, it's just "Advanced SIMD Floating-Point". The only third-party exception that I found is the folks at Linaro, who have some presentations about porting to ARMv8, and have also ported the libjpeg-turbo library to ARMv8.

However, the Linaro code isn't Apple friendly, doesn't show how to do ASM blocks, how to select between 32-bit and 64-bit mode, etc.

Better yet, Apple refuse to help. I asked their Apple Developer Technical Support (DTS) service for suggestions on best practice with a paid tech support incident. What I got back was: "Thank you for contacting Apple Developer Technical Support (DTS). DTS does not provide support for ARM assembly." So not much support there.

However, not one to let an absence of documentation or support stop me, below is a simple technology demonstrator of how it's possible to support NEON-32 and NEON-64 code in the form of inline asm blocks in an Xcode based project.

The example below is quite simple, just a 3x3 matrix by 3 element vector multiply routine, but it shows a structure that works. The 64-bit code is just a straight port of the 32-bit code; it may be possible to further optimize it, but this is just intended to show the principle.


#if !defined(__i386__) && defined(__ARM_NEON__)
#if  (!defined(__LP64__) && !defined(_LP64))
#define __MATH_NEON_32
#else
#define __MATH_NEON_64
#endif
#endif

void
matvec3_RowMajor(float matrix[3][3], float v[3], float d[3])
{
  float *m = (float *) matrix;       // treat the 3x3 matrix as 9 contiguous floats
  d[0] = m[0]*v[0] + m[1]*v[1] + m[2]*v[2];
  d[1] = m[3]*v[0] + m[4]*v[1] + m[5]*v[2];
  d[2] = m[6]*v[0] + m[7]*v[1] + m[8]*v[2];
}

void __attribute__((noinline)) matvec3_neon_RowMajor(float m[3][3], float v[3], float d[3])
{
#if defined(__MATH_NEON_32)
  __asm__ volatile (
    "vld1.32 {d6}, [%1]! \n\t"         // Q3 = v
    "vld1.32 d7[0], [%1] \n\t"
 
    "vld3.32 {d0, d2, d4}, [%0]! \n\t" // Q0 = {x, m0_6, m0_3, m0_0} = {d1, d0}
    "vld3.32 {d1[0], d3[0], d5[0]}, [%0] \n\t" // Q1 = {x, m0_7, m0_4, m0_1} = {d3, d2}
                                               // Q2 = {x, m0_8, m0_5, m0_2} = {d5, d4}

    "vmul.f32 q9, q0, d6[0] \n\t" // Multiply out
    "vmla.f32 q9, q1, d6[1] \n\t" //
    "vmla.f32 q9, q2, d7[0] \n\t" //
    "vmov.f32 q0, q9 \n\t" // Q0 = result

    "vst1.32 d0, [%2]! \n\t" // d[0], d[1] = D0 (low half of Q0)
    "fsts s2, [%2] \n\t" // d[2] = S2 (lane 2 of Q0)
 
    : "+r"(m), "+r"(v), "+r"(d)
    :
    : "q0", "q1", "q2", "q3", "q9", "memory"
  );
#elif defined(__MATH_NEON_64)
  __asm__ volatile (
    "ld1 {v3.2s}, [%1], 8 \n\t" // V3 = v
    "ld1 {v3.s}[2], [%1] \n\t"

    "ld3 {v0.2s, v1.2s, v2.2s}, [%0], 24 \n\t" // V0 = {x, m0_6, m0_3, m0_0}
    "ld3 {v0.s, v1.s, v2.s}[2], [%0] \n\t" // V1 = {x, m0_7, m0_4, m0_1}
                                                                        // V2 = {x, m0_8, m0_5, m0_2}
                      
    "fmul v9.4s, v0.4s, v3.s[0] \n\t" // Multiply out
    "fmla v9.4s, v1.4s, v3.s[1] \n\t" //
    "fmla v9.4s, v2.4s, v3.s[2] \n\t" //
                      
    "st1 {v9.2s}, [%2], 8 \n\t" // Result in V9
    "st1 {v9.s}[2], [%2] \n\t"
 
    : "+r"(m), "+r"(v), "+r"(d)
    :
    : "v0", "v1", "v2", "v3", "v9", "memory"
  );
#else
  matvec3_RowMajor(m, v, d);         // no NEON available: plain C fallback
#endif
}
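An alternative to maintaining two inline asm blocks is to use the arm_neon.h intrinsics, which compile for both the 32-bit and 64-bit instruction sets. The sketch below is not the routine above and has not been benchmarked against it; it is an illustration of the same multiply-accumulate structure, with a plain C fallback on non-NEON targets. The function name is made up for this example. It gathers the matrix columns through small temporary arrays rather than using a de-interleaving load, so it is likely to be less tight than the hand-written asm.

```c
#include <assert.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* d = m * v for a row-major 3x3 matrix. Uses NEON intrinsics where
   available and plain C elsewhere. */
void matvec3_intrinsics_RowMajor(float m[3][3], float v[3], float d[3])
{
    assert(m && v && d);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Gather each matrix column into a 4-lane vector; lane 3 is padding. */
    const float c0[4] = { m[0][0], m[1][0], m[2][0], 0.0f };
    const float c1[4] = { m[0][1], m[1][1], m[2][1], 0.0f };
    const float c2[4] = { m[0][2], m[1][2], m[2][2], 0.0f };
    float out[4];

    float32x4_t acc = vmulq_n_f32(vld1q_f32(c0), v[0]);  /* col0 * v[0]    */
    acc = vmlaq_n_f32(acc, vld1q_f32(c1), v[1]);         /* += col1 * v[1] */
    acc = vmlaq_n_f32(acc, vld1q_f32(c2), v[2]);         /* += col2 * v[2] */

    vst1q_f32(out, acc);                                 /* lane 3 unused  */
    d[0] = out[0];
    d[1] = out[1];
    d[2] = out[2];
#else
    for (int r = 0; r < 3; ++r)
        d[r] = m[r][0]*v[0] + m[r][1]*v[1] + m[r][2]*v[2];
#endif
}
```

Comparing the output of this routine, or of the asm versions, against the plain C matvec3_RowMajor on a few known matrices is a cheap way to catch lane-ordering mistakes.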