Read Access on Google Servers with XXE

Detectify explains how they gained read access to production servers at Google:

One system caught our eyes. The Google Toolbar button gallery. We looked at each other and jokingly said “this looks vuln!”, not knowing how right we were.


They were able to leverage XML External Entity (XXE) processing to read local files on Google’s production servers. If you haven’t read up on XXE, go watch Mike Adams’s talk at WordCamp SF 2013; the video is only 30 minutes.

Be very careful when processing XML from sources you don’t control; it can come back to bite you in very bad ways.
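
As a rough illustration of what "careful" can look like, here is a minimal Python sketch that refuses any document carrying a DOCTYPE declaration before handing it to a normal parser. External entities have to be declared inside a DOCTYPE, so rejecting it removes the hook XXE relies on. This is not what Google or Detectify did; the names `parse_untrusted_xml` and `DoctypeForbidden` are made up for this example, and it assumes your application never needs DTDs in uploaded XML.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration (hypothetical name)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Entities are declared inside the DOCTYPE, so refusing it blocks XXE up front.
    raise DoctypeForbidden("DOCTYPE declarations are not accepted")


def parse_untrusted_xml(data: bytes) -> ET.Element:
    # First pass: a bare expat parser whose only job is to spot a DOCTYPE.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(data, True)  # the handler's exception propagates out of Parse
    # Second pass: the document has no DOCTYPE, so parse it normally.
    return ET.fromstring(data)
```

An upload like `<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>` is rejected with `DoctypeForbidden` before any entity is looked at, while plain XML parses as usual. Parsing twice is wasteful for large documents; the point is only to show the idea.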

Joe Gregorio on AtomPub and XML

In a follow-up comment on his WebFinger post, Joe Gregorio gives some perspective on AtomPub and XML:

Look, AtomPub has this problem, and if I had to do it all over again I would build AtomPub in JSON. An implementer wants to go from bits on the wire to a native data structure they can interact with in their programming language. I used to think that was laziness, or lack of knowledge, but it’s purely pragmatic. By using XML you have introduced a layer of indirection, you’ve taken a data structure and converted it into a tree based document which then has to be converted back into a native data structure, but now that has to be done on a per language basis.

I’m in the lazy/pragmatic camp :-)

I really don’t want documents to describe data, I just want a nice way to serialize the data and right now my favored way of doing that is JSON.
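
To make the "layer of indirection" concrete, here is a small Python sketch. The payload and the helper `item_from_xml` are invented for this example and have nothing to do with AtomPub's actual format. The JSON version lands in a native dict with one call; the XML version gives you a tree that you then have to walk and convert by hand.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up record, serialized two ways.
json_wire = '{"name": "toolbar", "tags": ["a", "b"], "count": 2}'
xml_wire = (
    "<item><name>toolbar</name>"
    "<tags><tag>a</tag><tag>b</tag></tags>"
    "<count>2</count></item>"
)

# JSON: bits on the wire to a native data structure in one step.
native_from_json = json.loads(json_wire)


def item_from_xml(text):
    # XML: parse into a tree, then map the tree back to native types by hand.
    root = ET.fromstring(text)
    return {
        "name": root.findtext("name"),
        "tags": [tag.text for tag in root.findall("tags/tag")],
        "count": int(root.findtext("count")),  # types are not carried in the markup
    }


native_from_xml = item_from_xml(xml_wire)
```

Both end up as the same dict, but the mapping function has to be written for every document shape, and again in every language that consumes it.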

New Default XML Formats For MS Office

Tonight’s big announcement about MS Office turned out to be that the next version of MS Office will default to an XML format. The file extensions are going to change to reflect this: .DOCX (for Word), .XLSX (for Excel) and .PPTX (for PowerPoint). Scoble has a video interview with Brian Jones about the XML format on Channel 9.

I watched most of the interview with Brian Jones and it seems like MS might actually be doing the right thing with this format. Of course the proof will be in the pudding and I won’t believe it until I see it in a shipping product. The files that Word, Excel and PowerPoint will produce are actually zip files with several XML files within them, along with any other pieces that might be embedded in the document, like an image. So the theory is that these new default formats will actually be smaller than the current binary versions because of zip compression.
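
If the format really is just a zip of XML parts, poking at one should take nothing more than a stock zip library. Here is a hedged Python sketch: `build_demo_package` creates a stand-in archive in memory, and its member names (`document.xml`, `images/logo.png`) are placeholders I made up, not the layout of the real format, which hasn’t shipped yet.

```python
import io
import zipfile


def list_parts(package):
    """Return (name, uncompressed size, compressed size) for each member."""
    with zipfile.ZipFile(package) as zf:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zf.infolist()
        ]


def build_demo_package():
    # Stand-in for a document package; member names are hypothetical.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("document.xml", "<doc>" + "<p>hello</p>" * 200 + "</doc>")
        zf.writestr("images/logo.png", b"\x89PNG placeholder bytes")
    buf.seek(0)
    return buf


for name, size, packed in list_parts(build_demo_package()):
    print(name, size, packed)
```

Repetitive XML markup compresses very well, which is the basis for the claim that the zipped files could come out smaller than the binary ones.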

This is huge news, but I’m having a hard time getting excited about it. For some reason this announcement feels akin to MS announcing that the year is no longer 2001, which everyone but MS already knows. So what is the big deal? Listening to Scoble and Brian Jones gush about how many new and great things they’ll be able to do because the new format stores the document in plain text XML made me want to chant Unix, Unix, Unix, Unix. Hello, a good portion of the world has already figured out that storing things in plain text whenever possible is a good thing.

Don’t get me wrong, I’m glad MS is taking this step, and it is a big step forward for them. I guess I just expect more from a company that has more resources than many small countries put together. They should have had someone there smart enough to push for something like this in MS Office years ago; otherwise, what is the point of having $50+ billion in the bank?