Pothole Detection Robot (ROS + Gazebo + ResNet-50)

A four-wheel ground robot, simulated in ROS / Gazebo, that drives through a city-scale road network and classifies the road surface ahead of it as pothole or plain road. Classification runs on a forward-facing camera feed using a ResNet-50 transfer-learning model.

This was my undergrad (BTech ECE) final-year capstone at SVNIT Surat (2020-2021), built with a 4-person team and submitted in May 2021. Project guide: Prof. A. H. Lalluwadia.

Why this exists

Potholes are a real safety and cost problem. In India alone, pothole-related accidents kill roughly 3,000 people a year; in the US, the American Automobile Association estimated USD 3 billion / year in vehicle damage from potholes. Most existing solutions either rely on citizen reporting (slow) or on accelerometer-based detection from a vehicle that has already hit the pothole (too late). We wanted to prototype the upstream version: a small autonomous robot that scans the road ahead with a camera and flags potholes before a vehicle drives over them.

We chose to build the entire system in simulation so we could iterate on the perception, the robot design, and the world all in one place without owning the hardware.

What I built (with team)

End-to-end the project has three layers:

Robot model. A four-wheel differential-drive bot, designed in SolidWorks, exported to URDF, and spawned in Gazebo. Onboard sensor: a single forward-facing RGB camera (Gazebo libgazebo_ros_camera.so plugin).
Simulated world. A custom Gazebo world populated with road segments, buildings, trees, street lamps and intentionally-placed pothole meshes so we could rehearse the full driving + detection loop.
Perception. A ResNet-50 CNN (pretrained on ImageNet, head fine-tuned on a Kaggle "pothole and plain road" dataset) consumes the camera frames and emits a binary classification per frame.

My personal scope on the team was the simulation and perception integration side: getting the URDF + Gazebo world wired together, exposing the camera topic, training the classifier in Colab, and writing the ROS node that calls the model on each captured frame.

Demo

The robot, in Gazebo

The simulated city world (potholes visible as brown patches on the asphalt)

Onboard camera view (left: world, right: `/bot_urdf1/camera1/image_raw`)

A full driving-and-detection video is at Video Result.mp4 (77 MB) and the project report PDF is at Project Report.pdf.

Architecture

                     +-----------------------------+
                     |        Gazebo world         |
                     |  (FinalWorldTest.world)     |
                     |  roads + pothole meshes     |
                     +--------------+--------------+
                                    |
                          spawns + simulates
                                    v
   +---------------+    +-----------+-----------+    +--------------------+
   |  cmd_vel.py   |--->|        bot_urdf1      |--->| /bot_urdf1/camera1 |
   |  (teleop)     |    |   (4-wheel diff-drive |    |    /image_raw      |
   +---------------+    |   URDF + camera link) |    +----------+---------+
                        +-----------------------+               |
                                                                v
                                                  +-----------------------+
                                                  |    Prediction.py      |
                                                  |  (ROS node)           |
                                                  |  load ResNet-50 .h5   |
                                                  |  resize 256x256       |
                                                  |  argmax -> Pothole /  |
                                                  |          Plain Road   |
                                                  +-----------------------+

ROS package

A single catkin package, bot_urdf1, holds everything:

Project Files/project_ws/src/pkb/src/bot_urdf1/
  urdf/bot_urdf1.urdf            # SolidWorks -> URDF; includes camera sensor + diff-drive plugin
  launch/gazebo.launch           # spawns the world + the bot + tf_footprint_base
  launch/display.launch          # robot_state_publisher + joint_state_publisher_gui + RViz
  launch/FinalWorldTest.world    # the city world with planted potholes
  config/bot.yaml                # joint group effort controllers (l_/r_con_position_controller)
  src/cmd_vel.py                 # keyboard teleop publishing to /cmd_vel
  src/Prediction.py              # the perception node (TF / Keras)
  src/teleop_twist_keyboard.py   # standard ROS teleop, included for convenience
  Models/, World/, meshes/, ...  # Gazebo model assets (textures, .rar / .zip source meshes)
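
As a rough illustration of what a keyboard teleop node such as cmd_vel.py does, the sketch below maps a key press to a geometry_msgs/Twist on /cmd_vel. The key bindings, speeds, and function names are hypothetical and chosen for illustration; the actual script may differ.

```python
# Hypothetical key bindings and speeds; cmd_vel.py may use different ones.
KEY_BINDINGS = {
    "w": (1.0, 0.0),    # forward
    "s": (-1.0, 0.0),   # reverse
    "a": (0.0, 1.0),    # turn left (counter-clockwise, positive angular.z)
    "d": (0.0, -1.0),   # turn right
}


def key_to_velocity(key, linear_speed=0.5, angular_speed=1.0):
    """Map a key to (linear.x in m/s, angular.z in rad/s); unknown keys stop."""
    forward, turn = KEY_BINDINGS.get(key, (0.0, 0.0))
    return forward * linear_speed, turn * angular_speed


def publish_key(publisher, key):
    """Publish one geometry_msgs/Twist for the given key."""
    # Imported here so the mapping above can be used without a ROS install.
    from geometry_msgs.msg import Twist

    twist = Twist()
    twist.linear.x, twist.angular.z = key_to_velocity(key)
    publisher.publish(twist)
```

Inside a node, the publisher would typically be created with rospy.Publisher("/cmd_vel", Twist, queue_size=1) after rospy.init_node(...). Keeping the key mapping separate from the ROS calls makes it easy to check without a running roscore.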

Perception model

Training script: Project Files/Detection/pothole-detection-v2.ipynb

Backbone: ResNet-50, ImageNet weights, include_top=False, pooling='max'.
Head: Dropout(0.20) -> Dense(2048, relu) -> Dense(1024, relu) -> Dense(512, relu) -> Dense(2, softmax).
Input: 256 x 256 x 3 RGB.
Optimizer: Adam (lr=1e-5), categorical cross-entropy, ReduceLROnPlateau on val_acc.
Dataset: Kaggle "Pothole and Plain Road Images" (binary classification), 75/25 train/val split.
Inference: Prediction.py polls /bot_urdf1/camera1/image_raw, resizes to 256 x 256, runs model.predict, and prints Pothole Detected or Plain Road.
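
The configuration above can be sketched in Keras roughly as follows. This is a minimal reconstruction from the list, not the notebook's code: the helper names, the seed, and the choice to freeze the backbone (my reading of "head fine-tuned") are assumptions, and the notebook may well split the data with a Keras generator rather than a file-list helper.

```python
import random

# Values taken from the list above.
HEAD_UNITS = (2048, 1024, 512)   # widths of the three hidden Dense layers
INPUT_SHAPE = (256, 256, 3)      # RGB input size
NUM_CLASSES = 2                  # pothole vs. plain road


def split_train_val(paths, val_fraction=0.25, seed=0):
    """Shuffle file paths reproducibly and split them 75/25 by default.

    Hypothetical helper: the seed and the file-list approach are assumptions.
    """
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def build_model(weights="imagenet", freeze_backbone=True):
    """Build and compile a ResNet-50 classifier with the head listed above."""
    # Imported here so the helpers above work without TensorFlow installed.
    import tensorflow as tf

    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=weights,
        input_shape=INPUT_SHAPE, pooling="max",
    )
    # Assumption: "head fine-tuned" means the backbone weights stay frozen.
    backbone.trainable = not freeze_backbone

    layers = [backbone, tf.keras.layers.Dropout(0.20)]
    for units in HEAD_UNITS:
        layers.append(tf.keras.layers.Dense(units, activation="relu"))
    layers.append(tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"))
    model = tf.keras.Sequential(layers)

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def make_lr_callback(monitor="val_accuracy"):
    """ReduceLROnPlateau with Keras defaults for factor and patience.

    Older Keras versions name the metric 'val_acc', as in the list above.
    """
    import tensorflow as tf

    return tf.keras.callbacks.ReduceLROnPlateau(monitor=monitor)
```

Training would then pass callbacks=[make_lr_callback()] to model.fit with one-hot labels, since the loss is categorical cross-entropy over a two-unit softmax.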

The report also compares ResNet-34 and Inception-V3 as alternatives (Chapter 5, Table 5.3) before settling on the ResNet family for accuracy-per-parameter reasons.

Quickstart

This project was developed on ROS Melodic / Ubuntu 18.04 with Gazebo 9. It should run on Noetic / Ubuntu 20.04 with minor changes: switching to Python 3 print syntax and fixing the y_pred = True typo in Prediction.py (see Known limitations).

Prereqs:

sudo apt install ros-melodic-desktop-full \
                 ros-melodic-joint-state-publisher-gui \
                 ros-melodic-effort-controllers \
                 ros-melodic-gazebo-ros-pkgs
pip install tensorflow keras opencv-python numpy

Build the workspace:

cd "Project Files/project_ws"
catkin_make
source devel/setup.bash

Launch the simulated world + spawn the robot:

roslaunch bot_urdf1 gazebo.launch

In a second terminal, drive the bot with the keyboard:

rosrun bot_urdf1 cmd_vel.py
# or the standard ROS teleop:
rosrun bot_urdf1 teleop_twist_keyboard.py

In a third terminal, run the perception node (it expects model_1.h5 in the current working directory):

rosrun bot_urdf1 Prediction.py

To visualize in RViz instead of Gazebo:

roslaunch bot_urdf1 display.launch gui:=True

Tech stack

| Layer | Choice |
| --- | --- |
| Robotics OS | ROS Melodic |
| Simulator | Gazebo 9 (`gazebo_ros`, `libgazebo_ros_camera.so`) |
| Robot model | SolidWorks -> URDF, diff-drive, single RGB camera |
| Control | `effort_controllers/JointGroupEffortController` |
| Perception | TensorFlow / Keras, ResNet-50 transfer learning |
| Vision utils | OpenCV (`cv2.resize`, image read) |
| Language | Python 2.7 (ROS nodes), Python 3 (training notebook) |
| Build | catkin |

Results

The robot drives reliably under teleop through the simulated world.
The classifier was trained for 50 epochs on the Kaggle dataset; accuracy/loss curves and per-image predictions are in Chapter 6 of the project report. Those numbers have not been re-verified since 2021, so I am deliberately not quoting a headline accuracy here.
End-to-end loop (drive -> capture frame -> classify -> print label) works in simulation. The recorded run is in Video Result.mp4.

Known limitations / future work (honest list)

Classification, not detection. The model gives a whole-frame binary label, not a bounding box or a depth-localized pothole position. The next step would be a Faster R-CNN / YOLO-style detector, which the report scopes out in Chapter 2.6.
Shell-based inference loop. Prediction.py captures frames with os.system('rosrun image_view image_saver ...') rather than subscribing directly to the camera topic via cv_bridge. That was a pragmatic shortcut to ship; the right fix is a proper subscriber.
Single camera, no depth. A stereo pair or a depth sensor would let us localize potholes in 3D rather than just flag their presence.
Sim-only. Never deployed to a physical bot.
Unfixed typo. Prediction.py has y_pred = True (an assignment) where y_pred == True (a comparison) was intended. I have left it as-is to preserve the original submission state; it is worth fixing on a port to Noetic.
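
For the subscriber fix mentioned above, a minimal sketch might look like the following. It is not the code in Prediction.py: the node name, the helper names, and in particular the class order in LABELS are assumptions, and the pixel scaling must match whatever the training notebook used. Deciding by argmax over the two softmax scores also sidesteps the y_pred comparison entirely.

```python
CAMERA_TOPIC = "/bot_urdf1/camera1/image_raw"
# Assumption: class index 1 means pothole; check the notebook's class order.
LABELS = ("Plain Road", "Pothole Detected")


def label_from_scores(scores):
    """Return the label for the highest softmax score (argmax over classes)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best]


def make_callback(bridge, model, size=(256, 256)):
    """Build a subscriber callback that classifies each incoming frame."""
    def on_image(msg):
        import cv2
        import numpy as np

        # Convert the sensor_msgs/Image into an RGB NumPy array.
        frame = bridge.imgmsg_to_cv2(msg, desired_encoding="rgb8")
        # Assumption: raw 0-255 pixels; apply the notebook's scaling if any.
        batch = np.expand_dims(cv2.resize(frame, size), axis=0)
        scores = model.predict(batch)[0]
        print(label_from_scores(list(scores)))
    return on_image


def main(model_path="model_1.h5"):
    """Load the model and classify frames as they arrive on the camera topic."""
    import rospy
    from cv_bridge import CvBridge
    from sensor_msgs.msg import Image
    from tensorflow.keras.models import load_model

    rospy.init_node("pothole_prediction")  # hypothetical node name
    callback = make_callback(CvBridge(), load_model(model_path))
    # queue_size=1 drops stale frames when inference is slower than the camera.
    rospy.Subscriber(CAMERA_TOPIC, Image, callback, queue_size=1)
    rospy.spin()
```

Calling main() from the script's entry point would start the node. On TensorFlow 1.x, calling predict from a callback thread may additionally require reusing the graph that was active when the model was loaded; I have not tested this sketch against the original Melodic setup.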

Repo layout (top level)

.
├── Project Files/
│   ├── Detection/          # training notebook + standalone OpenCV/Keras script
│   ├── World/              # Gazebo screenshots + recorded run video
│   └── project_ws/         # catkin workspace (src + cached build/devel)
│       └── src/pkb/src/bot_urdf1/   # the ROS package
├── Project Report.pdf      # 56-page report, ECED SVNIT, May 2021
├── Project Presentation.pptx
├── Video Result.mp4        # end-to-end demo
├── media/                  # README screenshots
├── LICENSE
└── README.md

Authors

Team (BTech ECE, SVNIT Surat, May 2021):

Dhruvil Parikh (U17EC153), me; simulation + perception integration
Pankaj Kumar Vijayvergiya (U17EC122)
Prakash Saini (U17EC151)
Sarvesh Dubey (U17EC152)

Guide: Prof. A. H. Lalluwadia, ECED, SVNIT Surat.

Curated and re-published here as part of my GitHub portfolio: github.com/dparikh79 · dparikh79.github.io
