Virtually all of your choices are going to be driven by the device itself and the traffic required to make use of it: what the device is, how it works, how you want developers to interact with it, and how users want to use it.
1) Thread management is the same question everywhere: use a separate thread to keep other work from blocking, if blocking is going to be a problem. Of course, that doesn't make the problem go away; it just trades blocking for concurrency/synchronization, which is a different problem. All other things being equal, a separate thread for reads and writes dramatically simplifies decoupling the IO from how you process it.
2) That depends entirely on what the device sends/receives, both at the low level and at the design level.
3) Are the commands, fixed constants, or other aspects subject to change? If the device is a work in progress, they certainly will be. The most successful implementation I've ever done split the work into three layers: a read/write IO layer, connected to an interpreter layer that read the command grammar from a resource, topped by an API layer for use by applications. That was for a device that was in development for a very long time, with a command set that changed frequently; the design also made it possible to use the same code for multiple devices. Whether or not that works for you depends, again, on the device.
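
To make the layering in 1) and 3) a bit more concrete, here is a rough Python sketch, not the code from the project I described. Everything in it is an assumption made up for illustration: the `LoopbackTransport` stand-in for a real device, the `GRAMMAR_JSON` command table (which would normally live in a resource file), and the class and command names. The reader thread hands replies to the rest of the program through a queue, so the only synchronization point is the queue itself.

```python
import json
import queue
import threading

# Hypothetical grammar. In practice this would be loaded from a resource file
# so the command set can change without touching the code.
GRAMMAR_JSON = (
    '{"commands": {"set_speed": "SPD {value}", "get_status": "STAT?"},'
    ' "terminator": "\\n"}'
)


class LoopbackTransport:
    """Stand-in for a real device: answers each written line with 'OK <line>'."""

    def __init__(self):
        self._pending = queue.Queue()

    def write(self, data: bytes) -> None:
        self._pending.put(b"OK " + data)

    def read(self, timeout: float):
        try:
            return self._pending.get(timeout=timeout)
        except queue.Empty:
            return None  # nothing arrived within the timeout


class IOLayer:
    """Owns the transport and a reader thread; passes replies on via a queue."""

    def __init__(self, transport):
        self._transport = transport
        self.replies = queue.Queue()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._read_loop, daemon=True)
        self._thread.start()

    def _read_loop(self):
        # Short read timeout so the loop notices the stop flag promptly.
        while not self._stop.is_set():
            data = self._transport.read(timeout=0.05)
            if data is not None:
                self.replies.put(data)

    def send(self, data: bytes) -> None:
        self._transport.write(data)

    def close(self) -> None:
        self._stop.set()
        self._thread.join()


class Interpreter:
    """Maps command names to wire bytes using the grammar, and back."""

    def __init__(self, grammar: dict):
        self._commands = grammar["commands"]
        self._terminator = grammar["terminator"]

    def encode(self, name: str, **params) -> bytes:
        text = self._commands[name].format(**params) + self._terminator
        return text.encode("ascii")

    def decode(self, raw: bytes) -> str:
        return raw.decode("ascii").rstrip(self._terminator)


class Device:
    """API layer: what application code would actually call."""

    def __init__(self, io: IOLayer, interpreter: Interpreter):
        self._io = io
        self._interp = interpreter

    def command(self, name: str, timeout: float = 1.0, **params) -> str:
        self._io.send(self._interp.encode(name, **params))
        return self._interp.decode(self._io.replies.get(timeout=timeout))


def demo():
    io = IOLayer(LoopbackTransport())
    device = Device(io, Interpreter(json.loads(GRAMMAR_JSON)))
    try:
        return [device.command("set_speed", value=5), device.command("get_status")]
    finally:
        io.close()
```

With this split, supporting a changed command set or a second device would mostly mean swapping the grammar resource, and moving to a real serial or socket connection would mean replacing only the transport. Whether a simple request/reply queue like this is enough depends, once more, on the device; one that sends unsolicited messages would need more than this.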