-
xchg rax, rax – 0x01
On the next page of xchg rax, rax, we’re given a very simple program:
We know from the previous post how loop works. Each time it loops, it decrements the rcx register, so we need to set that register to something other than zero if we want to tinker with it. I set it to 10.
The xadd is the instruction of interest, and that is entirely the body of the loop. xadd is exchange and add.
The Intel x86-64 reference manual describes it as “Exchanges the first operand (destination operand) with the second operand (source operand), then loads the sum of the two values into the destination operand.”
So all we are doing in the loop is adding and exchanging the values in the rax and rdx registers. The book offers no hints on what the code is supposed to do, so the best we can do here is tinker with the values of the registers and see if the results are anything clever. We can make some guesses though. If the registers are both zero, we can figure that nothing interesting will ever happen. The loop will keep adding zeros until the loop counter reaches zero.
You might recognize what this does just by looking at the assembly. As a hint, set rax to 1 and rdx to 1, and watch the value of rax. Here are the values of the registers after each iteration of the loop:
Initial: rax = 0x0000000000000001 rdx = 0x0000000000000001 rcx = 0x000000000000000a
Next: rax = 0x0000000000000002 rdx = 0x0000000000000001 rcx = 0x0000000000000009
Next: rax = 0x0000000000000003 rdx = 0x0000000000000002 rcx = 0x0000000000000008
Next: rax = 0x0000000000000005 rdx = 0x0000000000000003 rcx = 0x0000000000000007
Next: rax = 0x0000000000000008 rdx = 0x0000000000000005 rcx = 0x0000000000000006
Next: rax = 0x000000000000000d rdx = 0x0000000000000008 rcx = 0x0000000000000005
And so on until rcx reaches zero.
You might recognize this as the Fibonacci sequence. Just about any developer at one point has tried implementing the Fibonacci sequence, either to learn a new language, for fun, or for school.
I find it impressive that using assembly you can accomplish this with a single instruction and a loop.
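To double check the register values without a debugger, here is a minimal Python sketch that models the loop. It is only an approximation of the semantics quoted from the manual: the names xadd and run_loop are made up for this illustration, and flags are ignored entirely.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide, so sums wrap around


def xadd(dest, src):
    """Model `xadd dest, src`: returns (new_dest, new_src)."""
    # The destination receives the sum, the source receives the old destination.
    return (dest + src) & MASK64, dest


def run_loop(rax, rdx, rcx):
    """Model `xadd rax, rdx` followed by `loop`; collect rax after each pass."""
    history = []
    while True:
        rax, rdx = xadd(rax, rdx)
        history.append(rax)
        rcx = (rcx - 1) & MASK64  # loop decrements rcx first...
        if rcx == 0:              # ...and falls through once it reaches zero
            break
    return history


if __name__ == "__main__":
    for value in run_loop(rax=1, rdx=1, rcx=10):
        print(f"rax = 0x{value:016x}")
```

Starting from rax = 1 and rdx = 1, the first values it prints line up with the rax column above.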
-
xchg rax, rax – 0x00
I recently picked up the book xchg rax,rax. This book is fascinating to me, so I thought I would blog about my interpretations of it one page at a time. I’m not an assembler expert, but I want to get better at it. I’m not going to go over how to run the assembly; there are a lot of posts out there to get started.
A little background on the book, in case this blog series doesn’t make sense: neither does the book. The book is 63 pages of x86-64 assembly snippets. Other than the requisite copyright notices, there are no words. There is no context to each snippet; it’s up to the reader to interpret them. Fun! The book is also freely available online.
The first page, 0x00 is fairly simple. It demonstrates different ways to zero a register.
The first instruction zeros the eax register:
This is the most common way I see to zero a register. XORing any number with itself will produce zero. It offers a very compact encoding size. Almost every function prolog is zeroing registers, so it’s a task that needs to be done quite often.
The second instruction zeros the ebx/rbx register:
lea, or load effective address, simply computes the address zero and stores it in the destination operand, rbx. There isn’t anything better about this approach, but it’s a way you can do it.
The next one is a bit more interesting:
The loop instruction does what it implies: it loops. The $ in this case means the current address counter. So, we’re looping to the same place, over and over again. However each time the loop executes, it decrements the ecx/rcx register. When the register reaches zero, the execution continues with the instruction after the loop instruction. It’s a very inefficient way to zero the ecx/rcx register.
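The behaviour described above can be modelled in a few lines of Python. This is a sketch, not a cycle-accurate emulation: loop_self is a name invented here, and the bits parameter exists only so the wraparound case can be tried with a narrow counter instead of waiting for a 64-bit one.

```python
def loop_self(rcx, bits=64):
    """Model `loop $`: decrement the counter, jump back while it is nonzero.

    Returns (final_counter, times_the_instruction_executed).
    """
    mask = (1 << bits) - 1
    executed = 0
    while True:
        rcx = (rcx - 1) & mask  # the decrement happens before the test
        executed += 1
        if rcx == 0:            # fall through to the next instruction
            return rcx, executed


if __name__ == "__main__":
    print(loop_self(5))          # a small starting value
    print(loop_self(0, bits=8))  # starting at zero wraps around first
```

One detail the model makes visible: because the decrement happens before the test, a counter that starts at zero wraps around and counts all the way back down, so the register still ends up at zero, just very slowly.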
The next one is a bit more obvious:
This moves the value zero into the edx/rdx register. This is much quicker than the loop approach, but it has a larger encoding than xor. The zero is encoded as an immediate inside the instruction: typically 32 bits, or a full 64 bits if the assembler picks the long form. Either way, that’s several bytes just to zero a register.
The next is similar, too:
This does a bitwise AND of the esi/rsi register and zero, and stores the result back in esi/rsi. ANDing any number with zero will always produce zero.
The next one uses subtraction:
It subtracts the edi register from itself and stores the result in the first operand, the edi register.
And finally, it ends with this:
The first instruction pushes zero onto the stack, and the second pops the value off the stack into the rbp register. This uses two whole instructions for zeroing a register, so it isn’t exactly efficient.
That wraps up the first page.
-
Parsing and modifying HTML in a Fiddler Extension
Continuing my “do everything in Fiddler” approach to web debugging, I ran into a situation where I wanted to parse and modify the response of the server before the browser received the response using Fiddler.
It’s definitely doable, but there wasn’t a clear cut example on how to do that, so here we go.
The best place to start is Telerik’s documentation on building an extension. This covers the ins and outs of getting started with developing an extension. Once you have a “hello world” extension working, you’re ready to start parsing HTML.
The Fiddler interface of choice here is going to be IAutoTamper2, using the interface method AutoTamperResponseBefore. AutoTamperResponseBefore is where we want to modify the HTML. This method is called after Fiddler has received the response from the server, but before it has pushed it to the browser. Modifications to the response body here will be reflected in what the browser renders.
There are a few guard checks we want to make first. Since we want to modify HTML, we should check that the response is actually HTML. We can partially accomplish this by examining the Content-Type header. If it contains “text/html”, then there is a good chance the content is HTML. Consult the IANA registry for other content types you may want to handle.
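The Content-Type guard is simple enough to sketch on its own. The snippet below is Python rather than the C# a Fiddler extension would actually use, and looks_like_html is a hypothetical helper, not part of Fiddler’s API. It is also slightly stricter than a plain “contains” check, since it compares only the media type and ignores parameters such as charset.

```python
def looks_like_html(content_type):
    """Return True when a Content-Type header value names text/html."""
    if not content_type:
        return False  # a missing header gives us nothing to go on
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


if __name__ == "__main__":
    for header in ("text/html; charset=utf-8", "application/json", None):
        print(repr(header), looks_like_html(header))
```

The same comparison carries over to the extension itself: read the header value from the session’s response, run the check, and return early when it fails.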
-
AWS Lambda Gets Useful with VPC support
OK, so this post’s title is a bit harsh, but AWS Lambda has added something really great.
To back up, Lambda is a service offered by AWS as a means of running code without jumping head first into full-blown EC2 instances or containers. Lambda functions can do some very interesting things, such as serving as the backend for AWS API Gateway endpoints.
Previously, there was one big hurdle to using Lambda for us: you couldn’t place functions inside of a VPC. This meant that whatever a Lambda function accessed had to be publicly accessible. Most of our infrastructure is private within the VPC and cannot be reached from the outside. Moreover, we didn’t want to make it accessible from the outside.
There was a thread on the AWS Forums about this, and AWS listened. You can now place a Lambda function inside of a VPC. More importantly, you can assign functions to security groups.
This is very interesting to us because we can now use Lambda without exposing things to the outside that we didn’t want to expose. One interesting case might be to act as a cron job. If you want something to run periodically, but don’t want to worry about where that cron job lives, Lambda is a good place to start.
As an example, we may want to periodically run optimize on our SOLR cluster. Well, with Lambda, we can now do that.
We have a simple node.js script that hits our SolrCloud cluster with a GET request to http://internal-solr-cluster:8983/solr/ourcollection/update?optimize=true.
Previously, as a Lambda function, it would not have been able to access the internal-solr-cluster Elastic Load Balancer. Once we assigned it to a VPC, placed it in the right security groups, and specified a CloudWatch Event to run on a schedule of once a week, we now have our SOLR collection getting optimized once a week without having to worry where the optimization runs from.
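For illustration, here is roughly what such a function could look like. Our actual script is node.js; this is a Python sketch of the same idea, and everything except the URL is an assumption made for the example: the handler name, the injectable fetch parameter (there so the logic can be exercised without a network), and the 60 second timeout.

```python
import urllib.request

# The internal URL from the post; only reachable from inside the VPC.
SOLR_OPTIMIZE_URL = (
    "http://internal-solr-cluster:8983/solr/ourcollection/update?optimize=true"
)


def _http_get(url, timeout=60):
    """Issue a GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def handler(event, context, fetch=None):
    """Lambda entry point: ask Solr to optimize the collection."""
    fetch = fetch or _http_get  # tests can pass a fake instead of real HTTP
    status = fetch(SOLR_OPTIMIZE_URL)
    if status != 200:
        raise RuntimeError(f"Solr optimize returned HTTP {status}")
    return {"optimized": True, "status": status}
```

An optimize can take a while on a large collection, so both the request timeout and the function’s own configured timeout would need to be sized accordingly.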
-
Regaining Access to OS X after a lost Yubikey
The Yubikey by Yubico has an interesting use beyond just OTP. It can do a myriad of things, including storing certificates, OATH, and, more interestingly, HMAC-SHA1 challenge response. The last of which is interesting because it can be used with a PAM module.
OS X supports PAM modules, and one of Yubico’s touted features is that you can install a PAM module on OS X, and you now have two factor authentication into your OS X account. In addition to the password, the Yubikey must also be plugged in.
I set that up a while ago and it had been working fine, but I ran into a situation where I needed to turn it off, temporarily, because I couldn’t actually log in. Say, because I didn’t have my Yubikey with me.
Turns out this is really trivial. Just boot the Mac into recovery mode by holding Command+R during boot. This let me edit the /etc/pam.d/authorization file and comment out the Yubico PAM module. Once saved, a quick reboot command later, I was back into my account, two factor turned off. The only thing to note is that you want to edit the one on your Macintosh HD volume under /Volumes, not the authorization file that the recovery partition uses.
This made my life easier, but it also led me to believe the Yubikey PAM module on local OS X accounts had diminished value (the story is different for remote authentication). If I can just turn it off with very little effort, no authentication required, that’s worrying.
There is a way to partially fix it, which is FileVault2. When you boot into the Recovery console with FileVault2 enabled, you cannot edit /etc/pam.d/authorization without knowing the password to the volume, since it is encrypted with your password. This, however, still reduces authorization to a single factor. If I have your password and no Yubikey, even with FileVault2 enabled I can get into the account since I have physical access.
This takes a few seconds of extra work. First, you need the UUID of the volume that you need to decrypt (like “Macintosh HD”). Run
diskutil coreStorage list
and grab the UUID of the logical volume. From there, it’s just one more command:
diskutil coreStorage unlockVolume <UUID> -stdinpassphrase
Enter your password, and then the volume will be mounted in /Volumes/.
In an ideal world, the Yubikey would play a role in unlocking the FileVault2 volume. This is easy enough to do with BitLocker and certificates since the Yubikey can act like a PIV card. However, I have not found a way to do this with FileVault2. Even in the case of BitLocker, it’s difficult to accomplish this without being on an Active Directory domain-joined machine and using an Active Directory account.
My advice would be to take the value that the Yubikey PAM module gives with a grain of salt for local account protection. At least on OS X (I have yet to bother trying on Windows) it’s quite easy to turn it off just by having access to the physical machine.
A lot of people will be quick to point out, “If you have physical access to the hardware, then it’s game over.” However, that doesn’t mean physical security should just be completely ignored. Each little improvement has value.