I spent today learning more about ROS and how to process ROS bag files in Python. So I played around with some of the examples in the rosbag cookbook, and this example in particular was useful for me today. From that example I learned that once you have created a bag object by specifying the location of the bag file, you can read messages from it using the read_messages method. This method has an optional topics argument that allows you to filter for only the messages from topics you want. In yesterday's blog I identified that the most important 'topics' for the Didi Challenge would be the "/image_raw" and "/velodyne_points" topics. I decided to start with just the "/image_raw" topic, since it would be the easiest one for verifying that what I am getting seems reasonable.
import rosbag

bag = rosbag.Bag("/path/to/bag_file.bag", "r")
bg = bag.read_messages(topics=["/image_raw"])
The return value of the read_messages() method is a Python generator. This is actually quite convenient, because it allows us to load one message at a time, as we need to process them, rather than loading all the data from a bag file in one go, which in the case of the Didi Challenge can occupy up to 11GB worth of memory.
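To get a feel for what consuming a generator one item at a time looks like, without needing an actual bag file, here is a toy sketch (the fake_messages generator and its contents are made up purely for illustration; they just mimic the (topic, msg, t) tuples that read_messages yields):

```python
def fake_messages():
    """Stand-in for bag.read_messages(): yields one tuple at a time."""
    for i in range(3):
        yield ("/image_raw", "msg %d" % i, i)

gen = fake_messages()

# Only one message is pulled into memory here; the rest
# stay unproduced until we ask for them.
topic, msg, t = next(gen)
print(topic, msg, t)
```

This lazy behaviour is exactly why the generator interface scales to multi-gigabyte bag files.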
Iterating through each item in the generator returns a tuple of three items:
topic, msg, t = next(bg)
Printing out msg using print(msg) outputs a very long list of numbers, which looks promising, as it could be the pixel values of the image. However, it is not clear how to actually access these values from the message object, since it does not behave like a list or a numpy array. Probing the type reveals that it is a data type I have never encountered before.
>>> type(msg)
tmplsf1md._sensor_msgs__Image
Probing the attributes and methods available to this object using dir(msg) reveals some useful-looking properties:

>>> dir(msg)
... 'data', 'deserialize', 'deserialize_numpy', 'encoding', 'header', 'height', 'is_bigendian', 'serialize', 'serialize_numpy', 'step', 'width'
The width and height properties looked like a promising way to verify that I did actually have image data. Querying them revealed that the data is 1400x512, which does seem like a reasonable aspect ratio given the visualisation from yesterday's blog.
>>> msg.width
1400
>>> msg.height
512
So it seemed reasonable that the image data would be stored in the msg.data
attribute.
>>> msg.data
\x1d\x16\x1c!"\x15\x17\x16\x13\x16\x18\x12\x16\x18\x1c\.....
>>> type(msg.data)
str
However, as can be seen from the above output, the data is stored as a string, encoded in a format that was completely incomprehensible to me, but which jogged my memory about something I had briefly seen done with numpy a long time ago. I remembered a numpy function that parses arrays from similarly formatted strings. I looked it up in the numpy documentation and decided to give it a try.
a = np.fromstring(msg.data, dtype=np.uint8)
It seemed to work, so I probed its values for sanity.
>>> a.min()
4
>>> a.max()
255
>>> a.shape
(716800,)
>>> msg.width * msg.height
716800
Everything seemed quite reasonable; the only catch was that the numpy array was flat, but it did match the expected number of values. So I reshaped the array, and converted it to a PIL image to visualize it.... and voila!!!! success :)
from PIL import Image

im = a.reshape(msg.height, msg.width)
Image.fromarray(im).show()
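One caveat worth noting: in recent versions of NumPy, np.fromstring is deprecated for binary input, and np.frombuffer is the drop-in replacement for parsing raw bytes like msg.data. A minimal sketch with a few fabricated byte values (the bytes here are my own, not from the bag):

```python
import numpy as np

# Four fabricated bytes standing in for raw pixel data
raw = b"\x1d\x16\x1c\x21"

# frombuffer interprets the raw bytes directly as uint8 values,
# without copying the underlying buffer
a = np.frombuffer(raw, dtype=np.uint8)
print(a)  # → [29 22 28 33]
```

The resulting array is read-only (it is a view onto the original buffer), so call .copy() on it if the pixel values need to be modified in place.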
Now we can put all that code together into something that will be useful when we decide to process the images.
import numpy as np
import rosbag

bag = rosbag.Bag(bag_file, "r")
messages = bag.read_messages(topics=["/image_raw"])
num_images = bag.get_message_count(topic_filters=["/image_raw"])

for i in range(num_images):
    # READ NEXT MESSAGE IN BAG
    topic, msg, t = next(messages)

    # CONVERT MESSAGE TO A NUMPY ARRAY
    img = np.fromstring(msg.data, dtype=np.uint8)
    img = img.reshape(msg.height, msg.width)

    # DO SOME PROCESSING ON THE IMAGE
    # ...
Using the above code in conjunction with OpenCV, we can view the video that is contained in the ROS bag.
import os

import cv2
import numpy as np
import rosbag

# SETTINGS FOR SAVING OUTPUT VIDEO
out_file = '/tmp/myVideo22.avi'  # Filepath to save the video as
fourcc_settings = cv2.VideoWriter_fourcc(*'XVID')
out_vid_dims = (1400, 512)
fps = 30                         # adjust based on input video
out = cv2.VideoWriter(out_file,
                      fourcc=fourcc_settings,
                      fps=fps,
                      frameSize=out_vid_dims,
                      isColor=False)

# ROSBAG SETTINGS
bag_file_dir = "/path/to/data"
bag_file_name = 'approach_3.bag'
bag_file = os.path.join(bag_file_dir, bag_file_name)

# OPEN BAG
bag = rosbag.Bag(bag_file, "r")
messages = bag.read_messages(topics=["/image_raw"])  # Only get image data
num_images = bag.get_message_count(topic_filters=["/image_raw"])

for i in range(num_images):
    # READ NEXT MESSAGE IN BAG
    topic, msg, t = next(messages)

    # CONVERT MESSAGE TO A NUMPY ARRAY
    img = np.fromstring(msg.data, dtype=np.uint8)
    img = img.reshape(msg.height, msg.width)

    # RESIZE TO OUTPUT DIMENSIONS
    img = cv2.resize(img, out_vid_dims)

    # SHOW THE FRAME
    cv2.imshow('VidFrame', img)

    # SAVE THE FRAME TO THE OUTPUT VIDEO
    out.write(img)

    # QUIT IF ESCAPE BUTTON PRESSED
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break

out.release()
cv2.destroyAllWindows()
print("DONE!")
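Since the message-to-array conversion appears in both of the snippets above, it could be factored into a small reusable helper. A sketch, tested against a fake message object (the msg_to_image name and the FakeMsg stand-in are my own, not part of rosbag; this assumes a single-channel mono8 image, matching the data explored above):

```python
from collections import namedtuple

import numpy as np


def msg_to_image(msg):
    """Convert a mono8 image message's raw bytes to a 2D numpy array."""
    flat = np.frombuffer(msg.data, dtype=np.uint8)
    return flat.reshape(msg.height, msg.width)


# Quick sanity check with a fake 2x2 "message"
FakeMsg = namedtuple("FakeMsg", ["data", "height", "width"])
m = FakeMsg(data=b"\x00\x40\x80\xff", height=2, width=2)
print(msg_to_image(m).shape)  # → (2, 2)
```

Inside the loops above, the two conversion lines would then collapse to img = msg_to_image(msg).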
Tomorrow I will do the same for the Lidar data, to see if the lidar data is also stored as point clouds, like the KITTI dataset. If it is, then I can make good use of the functions I created in the last few days here, here, and here.