ROS and ROS bags - Part 2

March 30, 2017, 9:30 p.m.

Summary

I spent today learning more about ROS and how to process ROS bag files in Python.

Cooking the Monster in the Bag

So I played around with some of the examples in the rosbag cookbook, and this example in particular was useful for me today. From that example I learned that once you have created a bag object by specifying the location of the bag file, you can read messages from it using the read_messages method. This method has an optional topics argument that allows you to filter for only the messages from topics you want. In yesterday's blog I identified that the most important topics for the Didi Challenge would be "/image_raw" and "/velodyne_points". I decided to start with just the "/image_raw" topic, since it would be the easiest one to verify that what I am getting seems reasonable.

import rosbag
bag = rosbag.Bag("/path/to/bag_file.bag", "r")
bg = bag.read_messages(topics=["/image_raw"])
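
As a quick side note, if you are not sure what topics a bag actually contains, the bag object also has a get_type_and_topic_info() method that summarises them. This is just a small sketch; the msg_type and message_count field names are from memory, so treat them as an assumption to verify:

info = bag.get_type_and_topic_info()
for topic_name, topic_info in info.topics.items():
    # topic_info should hold the message type and the number of messages on that topic
    print("%s: %s (%d messages)" % (topic_name, topic_info.msg_type, topic_info.message_count))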

The return value of the read_messages() method is a Python generator. This is actually quite convenient, because it allows us to load one message at a time, as we need to process it, rather than loading all of the data from a bag file in one go, which in the case of the Didi Challenge can occupy up to 11GB of memory.

Iterating through each item in the generator returns a tuple of three items:

topic, msg, t  = bg.next()
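
Since it is a generator, you can also just loop over it directly instead of calling next() by hand. A minimal sketch of what that would look like:

for topic, msg, t in bag.read_messages(topics=["/image_raw"]):
    # each iteration yields one (topic, message, timestamp) tuple
    # ... do something with msg here ...
    break  # stop after the first message for now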

Printing out msg using print(msg) prints out a very long list of numbers, which looks promising, as it could be the pixel values of the image. However, it is not clear how to actually access these values from the message object, since it does not behave like a list or a numpy array. Probing the type reveals a data type I have never encountered before.

>>> type(msg)
tmplsf1md._sensor_msgs__Image

Probing the attributes and methods available to this object using dir(msg) gives:

>>> dir(msg)
 ...
 'data',
 'deserialize',
 'deserialize_numpy',
 'encoding',
 'header',
 'height',
 'is_bigendian',
 'serialize',
 'serialize_numpy',
 'step',
 'width'

The width and height properties looked like a promising way to verify that I did actually have image data. Querying them revealed that the image is 1400x512, which seems like a reasonable aspect ratio given the visualisation from yesterday's blog.

>>> msg.width
1400
>>> msg.height
512

So it seemed reasonable that the image data would be stored in the msg.data attribute.

>>> msg.data
1d\x16\x1c!"\x15\x17\x16\x13\x16\x18\x12\x16\x18\x1c\.....
>>> type(msg.data)
str

However, as can be seen from the above output, the data is stored as a string, encoded in some format that is completely incomprehensible to me, but which triggered my memory about something I had briefly seen done with numpy a long time ago. I remembered a numpy function that parses arrays from similarly formatted strings. I looked it up in the numpy documentation and decided to give it a try.

a = np.fromstring(msg.data, dtype=np.uint8)
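
(A small aside: newer versions of numpy deprecate np.fromstring for binary data in favour of np.frombuffer, which I believe takes the same arguments here, though I have only tested the fromstring version:)

a = np.frombuffer(msg.data, dtype=np.uint8)  # same result, but the resulting array is read-only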

It seemed to work, so I probed its values for sanity.

>>> a.min()
4
>>> a.max()
255
>>> a.shape
(716800,)
>>> msg.width * msg.height
716800

Everything seemed quite reasonable. The only issue was that the numpy array was flat, but it did contain the expected number of values. So I reshaped the array and converted it to a PIL image to visualize it.... and voila!!!! success :)

from PIL import Image
im = a.reshape(msg.height, msg.width)
Image.fromarray(im).show()
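
If you would rather save the frame to disk than pop up a viewer, the same PIL image can be written out directly (the output path here is just illustrative):

Image.fromarray(im).save("/tmp/first_frame.png")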

Image of first frame extracted from Python

Processing All Images in a Bag

Now we can put all that code together into something that will be useful when we decide to process the images.

import numpy as np
import rosbag

bag_file = "/path/to/bag_file.bag"  # path to the bag file to process
bag = rosbag.Bag(bag_file, "r")
messages = bag.read_messages(topics=["/image_raw"])
num_images = bag.get_message_count(topic_filters=["/image_raw"])

for i in range(num_images):
    # READ NEXT MESSAGE IN BAG
    topic, msg, t  = messages.next()

    # CONVERT MESSAGE TO A NUMPY ARRAY
    img = np.fromstring(msg.data, dtype=np.uint8)
    img = img.reshape(msg.height, msg.width)

    # DO SOME PROCESSING ON THE IMAGE    
    # ... 
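
As an alternative, ROS also ships a cv_bridge package, which is the more standard way of turning a sensor_msgs/Image into a numpy/OpenCV array, and it would replace the manual fromstring/reshape step above. I have not tried it on this bag yet, so treat the following as an untested sketch:

import rosbag
from cv_bridge import CvBridge

bridge = CvBridge()
bag = rosbag.Bag("/path/to/bag_file.bag", "r")
for topic, msg, t in bag.read_messages(topics=["/image_raw"]):
    # cv_bridge takes care of the encoding and row-stride details for us
    img = bridge.imgmsg_to_cv2(msg, desired_encoding="mono8")
    # ... process img as a numpy array ...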

Viewing Video in a Bag

Using the above code in conjunction with OpenCV, we can view the video that is contained in the ROS bag.

import cv2
import numpy as np
import os
import rosbag

# SETTINGS FOR SAVING OUTPUT VIDEO
out_file = '/tmp/myVideo22.avi'  # Filepath to save the video as
fourcc_settings = cv2.VideoWriter_fourcc(*'XVID')
out_vid_dims = (1400, 512)
fps = 30  # adjust based on input video
out = cv2.VideoWriter(out_file,
                      fourcc=fourcc_settings,
                      fps=fps,
                      frameSize=out_vid_dims,
                      isColor=False)

# ROSBAG SETTINGS
bag_file_dir = "/path/to/data"
bag_file_name = 'approach_3.bag'
bag_file = os.path.join(bag_file_dir, bag_file_name)

# OPEN BAG
bag = rosbag.Bag(bag_file, "r")
messages = bag.read_messages(topics=["/image_raw"])  # Only get image data
num_images = bag.get_message_count(topic_filters=["/image_raw"])

for i in range(num_images):
    # READ NEXT MESSAGE IN BAG
    topic, msg, t = messages.next()

    # CONVERT MESSAGE TO A NUMPY ARRAY
    img = np.fromstring(msg.data, dtype=np.uint8)
    img = img.reshape(msg.height, msg.width)

    # RESIZE TO OUTPUT DIMENSIONS
    img = cv2.resize(img, out_vid_dims)

    # SHOW THE FRAME
    cv2.imshow('VidFrame', img)

    # SAVE THE FRAME TO THE OUTPUT VIDEO
    out.write(img)

    # QUIT IF ESCAPE BUTTON PRESSED
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break

out.release()
cv2.destroyAllWindows()

print("DONE!")

Next Steps

Tomorrow I will do the same for the Lidar data, to see if it is also stored as point clouds, like the KITTI dataset. If it is, then I can make good use of the functions I created over the last few days here, here, and here.
