CDE4301, ASI-404
Mobile Manipulator for the IDP Hub
(Final report)
Table of Contents
Acknowledgements
We would like to sincerely thank our project supervisors, Mr Nicholas and Graham, for their guidance, encouragement, and confidence in our team throughout this module. Their mentorship has been a key factor in our success, and their trust in allowing us the freedom to explore and make decisions independently has greatly enriched our learning experience. Our supervisors' willingness to share their knowledge and provide constant support has been invaluable, and their dedication to our growth has made a lasting impact on our project journey.
Our gratitude also goes to the workshop technicians, Mr Hamilton and Mr Dickson, for their technical expertise and continuous assistance in ensuring our design was optimized for effective manufacturing.
We would also like to thank Ms Annie for her efforts in managing procurement and ensuring that essential materials were acquired smoothly and on time. Thanks also go to Yee Teng and Patrick for their cooperation and support in facilitating our project within the Innovation and Design Hub, and for providing us with valuable insights into the Hub's operations.
Special thanks to Professor Goh Cher Hiang for reviewing our project and offering constructive feedback that helped us improve further.
Finally, we extend our appreciation to EDIC for their generous funding support, which has been instrumental in bringing our project to fruition.
1 Introduction
This project develops a mobile manipulator for NUS's Innovation & Design Hub to relieve staff workload. It integrates mechanical, electrical, and software design into a unified robot system with user-friendly GUI control and seamless system integration. This report covers the overall development, integration, and validation of the mobile manipulator system, with a focus on the technical development. More detail on the problem analysis, value proposition, and initial analysis can be found in the interim report. Link to Interim Report
1.1 Context
Robots are increasingly used to relieve labour shortages and handle repetitive tasks across sectors like warehousing, healthcare, and hospitality. This is accelerating a shift from factory-only deployment to human-centric environments, with robots taking on point-to-point delivery and routine workflows as seen in figure 1.
The team was interested in exploring where the National University of Singapore could benefit from such a system. To explore this, the team chose the Innovation and Design Hub as the testing ground.
1.2 Location and stakeholders
The Hub was chosen as the location of interest as it is one of the largest workspaces within NUS, drawing high student traffic from various disciplines who use its facilities for discussions, fabrication, and tinkering. The floor plan of the space is shown in figure 2.
The project involves the Hub Technical Staff seen in figure 3, How Yee Teng (primary) and Patrick Tan Shoong Chin (secondary), as well as the Hub Technical Assistants. Data gathered through surveys and observations helped the team understand their roles and contributions, as well as the problems the staff faced on the ground.
1.3 Problem Analysis and Problem Statement
To ascertain the problems further, we drafted a problem identification table using the 5W1H template, shown in Figure 4.
From our previous presentation, the problem statement defined by the team was that several redundant tasks are slowing down the work of the Hub staff and reducing efficiency in the Hub during the school semester, and that there is a need to streamline or automate these tasks.
Our design statement for the project is:
To build a system that is mobile, able to travel around the Hub's environment; can manipulate and interact with the Hub's objects; and is autonomous, so that staff do not need to act as a middleman.
1.4 Value Proposition
To assist with tasks that cannot be handled by software changes due to the physical nature of the interaction, a physical solution can be created to aid the staff in handling work, as proposed in the Value Proposition Canvas in Figure 5.
The team will develop an autonomous mobile manipulator that combines navigation and manipulation: it can move through the Hub, pick and handle objects, and deliver them to staff—without requiring constant monitoring or extra devices.
1.5 Subsystems to develop
1.5.1 Design Specifications
Based on the derived problem statement and the key problems identified by the staff, the design specifications were prepared. Given our limited manpower, a budget of 1600 SGD, and less than a year for development, we scoped the project down to two distinct tasks that the system must achieve to assist the staff.
For task 1, the robot is to grab tools from the pegboard shown in Figure 6, place them into an onboard basket, and deliver them to a table. An end-effector system with tool changer capabilities enables picking tools of different geometries.
For task 2, so that staff can command the system and monitor its status onboard, a user interface is to be created that links user commands to the corresponding hardware commands for performing the task. A sequence diagram for the user interface is shown in Figure 7.
1.5.2 Subsystem Identification
Based on the taskings that the robot must accomplish, we have identified the key subsystems that need to be developed to bring the system to a minimum viable product, shown in figures 8 and 9.
1.6 Key changes from interim
Previously, a task 2, point-to-point delivery, was to be implemented, but due to time, cost, and manpower constraints on development, this was shelved as a potential future addition to the system.
The system will be able to deliver items around the Hub but will not be able to return them; the mechanical complexity was scaled back to allow more development time for the software and main hardware deliverables.
To complement the pick-and-place operations with consistent feedback to the system, the team opted to implement computer vision with depth data, guiding the system to pick up tools with high precision and repeatability.
1.7 Work Allocation
The updated work allocation amongst the team members is shown in the table below.
| Team Members | Scope of work |
|---|---|
| Jason | Tool Changer Subsystem |
| Fang Chen | Robot Structure Subsystem |
2 Robot Structure
This chapter covers the development of the robot structure subsystem, including both the mechanical and electrical aspects of the system. The subsystem was designed to provide a stable and modular platform for mounting the robot arm, integrating the required electronics, and supporting safe and practical operation within the intended environment.
2.1 Mechanical
The mechanical subsystem focuses on designing a robot structure that can securely mount the robot arm while remaining lightweight, modular, and capable of housing all required components.
2.1.1 Recap of Semester 1 Developments
Figure 10 below shows the progress of the frame structure from the interim presentation period.
The robot structure was designed with consideration of the operational environment of the Hub, the payload capacity of the AMR, and the AMR’s permissible center of gravity as shown in figure 11.
In addition to these considerations, the structure was designed to fulfil several key functional requirements as shown in the figure 12.
Prior to implementing the design, a total of five design iterations were developed and evaluated using a concept selection matrix (figure 13 and 14). A detailed comparison of these design iterations is provided in Appendix D1-D6.
From the concept selection matrix, the best performing configuration was identified and taken forward for detailed design. The key design features of the implemented structure are described in the following paragraphs.
First, the height and layout of the structure was designed to ensure that the robot arm can effectively reach the required workspaces while maintaining overall system stability as seen in figure 15.
Second, the frame was optimized to be lightweight, while ensuring ease of assembly (Figure 16).
Third, the frame allows for easy future modifications, where additional profiles can be added for reinforcement, and the robot arm position can be adjusted to rebalance the centre of gravity without major redesign. (Figure 17).
Fourth, a vertical panel with DIN rail mounting was implemented to organise wiring and electronic components. The segregation of control and power panels improves accessibility and simplifies troubleshooting. (Figure 18)
After the structure was designed, structural evaluations such as CG analysis and static analysis (shown in Figure 19) were also conducted to validate the stability of the system. An explanation of how the analysis was carried out can be found in appendix D7.
Overall, the interim work established a stable and well-integrated mechanical platform, forming the foundation for further system development in Semester 2.
2.1.2 Semester 2 Developments
While provisions for additional features were considered during the initial design phase, their detailed design and physical implementation were carried out in Semester 2. The key features implemented in semester 2 are described below.
1. Interface Panel and Emergency Stop
An interface panel with push buttons was implemented to enable quick execution of predefined commands. Two E-stop buttons were installed at accessible locations on the robot. These E-stops are integrated with both the robot arm and the AMR, ensuring that both systems stop simultaneously when activated. These can be seen in Figure 20.
2. Battery Box
A battery box is required to safely house the battery. Initially, we designed a quick-release sliding battery box (see appendix D8) so that the battery could be connected to and disconnected from the robot power system in a single action. However, due to time constraints, a commercial Pelican case was adopted as a practical alternative, as seen in figure 21.
3. Adjustable Monitor and External connection ports
An adjustable monitor and external connector ports (HDMI, USB and LAN ports) were integrated into the system to facilitate easy debugging and system monitoring. (Figure 22).
4. Cooling System
Cooling fans were installed to improve airflow within the enclosure. One fan supplies air to the controller’s intake vent, while another expels hot air at the exhaust vent. The intake fan is positioned at the bottom and the exhaust fan at the top to utilise natural convection, as shown in figure 23.
5. Storage Unit
A compartment was allocated to be a storage unit as seen in figure 24 for miscellaneous components such as tools and accessories, to make things more convenient for users.
6. External Covers
Protective external panels and 3D printed corner covers as shown in figure 25 were fabricated from acrylic and PETG respectively, improving both safety and aesthetics. PETG was chosen over PLA for its superior impact resistance.
With the addition of new components to the robot, the total payload was reassessed to ensure that it remained within the AMR's 100 kg capacity. The components were weighed, and the robot structure was found to have a total mass of 87.24 kg. Accounting for a +5% tolerance to include screws and other miscellaneous items, the estimated total mass increases to 91.61 kg, confirming that the final system remains within the allowable payload limit for safe operation. Figure 26 shows the components weighed for the final system.
The video below shows the final robot structure with all the components integrated.
2.2 Electrical
The electrical subsystem supports power delivery, system protection, onboard computation, and auxiliary electronics integration. Building on the baseline architecture established earlier, the final design refined the power distribution and safety systems to support reliable robot operation.
2.2.1 Final Circuit Design
Figure 27 shows the finalised circuit diagram of the robot structure. As outlined in the interim report, the AMR provides a maximum output of 48 V at 20 A (~960 W), which is insufficient to reliably support the robot arm and its controller, specified at approximately 1 kVA in continuous operation. To address this, a dual-power architecture was adopted: the robot arm is powered by a dedicated 72 V, 1152 Wh battery pack via an inverter, while auxiliary components such as the onboard computer and networking devices are powered directly from the AMR. This architecture ensures reliable operation and sufficient power availability.
Building on this baseline architecture, the electrical subsystem was further refined through the integration of additional components required for full system functionality. Key additions include the robot-side and tool-side changer boards, which enable electrical interfacing for the tool changer mechanism, as well as the tool changer electromagnet for end-effector operation (section 3.3). In addition, two DC fans were introduced to provide active cooling for improved thermal management of the enclosed electronics, and a mini monitor was integrated to facilitate system debugging.
2.2.2 Centralised E-stop System
As shown in the interim report, the E-stop circuit enables simultaneous shutdown of both the robot arm and the AMR. This is necessary because the two systems currently operate with separate E-stop circuits.
As described in Section 2.1.2, two E-stop buttons were implemented: one located on the interface panel at the rear of the robot, and the other positioned at the front for easy access. This configuration improves accessibility and ensures that the robot can be stopped quickly from multiple operating positions. This circuit is shown in figure 28.
A video showing the E-stop testing is referenced in Section 6.1 on edge case testing and safety considerations.
2.2.3 Battery Performance and Operational Considerations
Following the interim review, a key consideration was the operational runtime of the system from a user perspective. Based on testing, the battery pack powering the robot arm provides approximately 6 hours of operation before low-voltage conditions of around 68.5 V lead to insufficient power warnings on the arm. A full recharge to nominal voltage of about 82 V takes approximately 1 hour.
The AMR, as specified by the manufacturer, provides a runtime of approximately 8 to 10 hours, with a charging time of about 2.5 to 3 hours, and roughly 1.5 hours for 0 to 80% charging. Since the AMR powers only auxiliary components in the final architecture, its runtime is not the limiting factor.
In the context of the Hub’s operating hours from 8.30 am to 6 pm, which is about 9.5 hours, the robot arm battery is the primary constraint on continuous usage. However, the short charging time allows the system to be recharged during low-usage periods, enabling continued operation throughout the day with minimal downtime.
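As a rough sanity check, assuming most of the pack's rated capacity is consumed over the observed 6-hour runtime, the arm's average power draw can be estimated as:

$$\bar{P}_{\text{arm}} \approx \frac{E_{\text{pack}}}{t_{\text{runtime}}} \approx \frac{1152~\text{Wh}}{6~\text{h}} \approx 192~\text{W}$$

This is well below the arm controller's approximately 1 kVA rating, suggesting that typical operation draws only a fraction of the rated figure.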
Overall, the power setup balances runtime, charging time, and flexibility, allowing the system to remain practical and usable in its intended environment.
3 End Effector & Tool Changer Systems
3.1 Interim developments
To design the end effectors for the system, design constraints must be established in terms of the chosen actuation mode, overall maximum size, and maximum mass before further analysis of the approach can be done.
From the interim presentation, the actuation mode of the end effector and tool changer was confirmed after analysing the constraints and ranking the different actuation modes using the concept screening technique, as shown in the table below. Electric actuation was selected.
| Selection Criteria | Weight | Pneumatic Rating | Weighted Score | Hydraulic Rating | Weighted Score | Electric Rating | Weighted Score |
|---|---|---|---|---|---|---|---|
| Force Output | 15% | 3 | 0.45 | 4 | 0.60 | 3 | 0.45 |
| Actuation | 5% | 4 | 0.20 | 4 | 0.20 | 3 | 0.15 |
| Noise | 20% | 2 | 0.40 | 2 | 0.40 | 4 | 0.80 |
| Maintenance | 20% | 3 | 0.60 | 2 | 0.40 | 4 | 0.80 |
| Hardware Requirements | 40% | 2 | 0.80 | 2 | 0.80 | 4 | 1.60 |
| Total Score | | | 2.45 | | 2.40 | | 3.80 |
| Rank | | | 2 | | 3 | | 1 |
| Continue? | | | No | | No | | Yes |
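As a quick check on the arithmetic, the short Python sketch below reproduces the weighted totals in the table (each weighted score is simply weight × rating):

```python
# Reproduce the concept screening totals: weighted score = weight * rating.
weights = {"Force Output": 0.15, "Actuation": 0.05, "Noise": 0.20,
           "Maintenance": 0.20, "Hardware Requirements": 0.40}
ratings = {
    "Pneumatic": {"Force Output": 3, "Actuation": 4, "Noise": 2,
                  "Maintenance": 3, "Hardware Requirements": 2},
    "Hydraulic": {"Force Output": 4, "Actuation": 4, "Noise": 2,
                  "Maintenance": 2, "Hardware Requirements": 2},
    "Electric":  {"Force Output": 3, "Actuation": 3, "Noise": 4,
                  "Maintenance": 4, "Hardware Requirements": 4},
}
for concept, r in ratings.items():
    total = sum(weights[c] * r[c] for c in weights)
    print(f"{concept}: {total:.2f}")  # Pneumatic 2.45, Hydraulic 2.40, Electric 3.80
```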
From this, and knowing the dimensional constraints of the mounting, the design specifications for the end effector and tool changer system were drafted and are shown in the table below.
| Subsystem | Design Specification | Value |
|---|---|---|
| Overall system | Dimensional requirement | 165 × 230 × 135 mm (L × W × H) |
| | Maximum mass | 2 kg |
| | Actuation mode | Electric |
| | Cost | < 500 SGD |
| Grippers | Max gripper length | 130 mm |
| | Maximum mass | 1 kg |
| | Control mode | Open/closed loop |
| | Minimum gripping force | 1 kg |
| | Payload support | 1–3 kg |
| Tool changer system | Maximum mass | 1 kg |
| | Max supporting payload mass | 10 kg (safety factor of 2) |
| | Holds gripper system under power loss | — |
| | Fully controllable by robot system | — |
| | Able to transmit power and signal to gripper | — |
Due to the wide variety of tools that the robot is required to grasp, careful consideration had to be given to the final gripper designs. An additional factor was the storage fixtures used for the tools. Feedback from stakeholders indicated that the existing tool holders were not to be modified, making them a key design constraint. Although compliant gripper systems can handle a broad range of object geometries, the conditions imposed by the pegboard storage arrangement made the development of a truly universal end effector impractical. As a result, the team elected to develop a system that allows the robot to automatically exchange end effectors according to the tool being handled, thereby enabling more reliable and optimised pickup performance across all tool types.
A prototype end effector was developed: a rack-and-pinion gripper mechanism with swappable fingers. This design was chosen primarily for being straightforward to implement, as shown in the concept screening in figure 29, while also aligning with the direction validated during the interim presentation. The prototype gripper can be seen in figure 30.
Nonlinear dynamic analyses were also performed on the compliant gripper fingers, as seen in figure 31, to evaluate their deformations in operation. These results were then validated during grasping tests with various tools.
Concepts were also evaluated for the tool changer mechanism, as shown in figure 32, with electromagnets selected as the approach. Tests to validate the electromagnet strength with a steel plate of selected thickness were successfully completed, as shown in figure 33. The exact calculations for the steel thickness are shown in Appendix E.
Following the validation of these concepts, the next sections will cover the detailed design and development of the finalised end effector and tool changer prototype systems, including the mechanical design, electrical integration, and control strategies implemented to achieve the required functionality.
3.2 End effector
3.2.1 Mechanical Development
Regarding the mechanical design of the end effector, the body was redesigned to further optimise its overall size, with particular emphasis placed on enclosing the mechanism’s dynamic components, namely the rack-and-pinion assembly. This was done to reduce the likelihood of pinch injuries should a user place their fingers near the moving regions of the mechanism during operation. Figure 34 above shows the exploded view of the gripper while figure 35 below shows the comparison of the initial prototype and the final prototype, where the size reduction and improved aesthetics of the final design can be observed.
Most critically, the gripper attachment interface was redesigned to incorporate a universal mounting feature, allowing direct attachment to the slave side of the tool changer. This modification was essential in enabling automatic end effector exchange within the overall system. Further details of this interface are presented in Section 3.3.1.
The method of swapping gripper fingers was also improved, such that each finger could be secured to the rack using only two M3 × 25 mm screws as seen in figure 36. This reduced the complexity of finger replacement and improved maintainability during testing.
Following successful preliminary evaluation of the compliant gripper finger design, this concept was retained in the updated iteration. Experimental testing showed that the compliant fingers were able to successfully grasp screwdrivers, wrenches, and pliers. However, due to the mounting geometry and handling requirements of the mallet, a different gripper finger configuration was required to achieve reliable pickup. For this tool, the selected approach involved lifting the mallet while maintaining its orientation throughout the transfer process until deposition, thereby ensuring stable handling. The different gripper options are shown in figure 37.
3D Model of the Gripper
3.2.2 Electrical Development
For the electrical subsystem, the DS3218MG servo motor was selected primarily for its suitable torque characteristics for the intended gripping application. As the servo has a maximum rated input voltage of 6.4 V, an LM2596 buck converter was incorporated to step down the available 12 V supply to an appropriate operating voltage. At the same time, retaining a 12 V supply rail preserves the option of integrating higher-torque actuators in future revisions, such as the Dynamixel AX-12A, should greater actuation capability be required.
To control the opening and closing motions of the gripper, an STM32F103C8T6 microcontroller was used. The tool-side circuit board, which formed part of the tool changer electronics, was designed to house the control circuitry and embedded firmware responsible for end effector actuation. Further details of this implementation are provided in the following section.
Figure 38 shows the electrical block diagram of the gripper system.
3.2.3 Integration and Testing
The integration of the electrical subsystem with the mechanical components was a critical phase in the development process. This involved ensuring that all electrical connections were properly made and that the control circuitry functioned as intended. Initial testing focused on validating the functionality of the gripper mechanism, including the opening and closing motions on the specified tools for pickup. Figure 39 above showcases the grippers undergoing trials. The videos below show the testing of the gripper with the compliant fingers and the mallet configuration, demonstrating successful grasping and handling of the tools.
In terms of pick reliability and repeatability, the gripper was able to achieve consistent performance across multiple trials with the compliant fingers for screwdrivers. The mallet configuration also showed reliable handling, although the specific geometry of the mallet required careful positioning to ensure a secure grip. Overall, the integrated end effector system met the functional requirements for tool handling within the intended application context.
However, for the compliant fingers, the gripper was not able to achieve a reliable grasp on the screwdrivers when the gripper is offset from the ideal grasping position. This is because the compliant fingers rely on deformation to conform to the tool geometry, and significant offsets can lead to insufficient contact and grip strength. It is because of this that the computer vision subsystem was implemented to provide feedback on the gripper position relative to the target tool, allowing for adjustments to be made to ensure a successful grasp. Future iterations of the gripper design could also explore modifications to the finger geometry or material properties to improve robustness against positional offsets. Figure 40 shows some examples of failed grasps.
3.3 Tool Changer
3.3.1 Electrical Development
As the electrical subsystem of the tool changer imposed several constraints on the overall mechanical implementation, it is addressed first in this section.
Evaluation of the available computing hardware, particularly the UNO-238 V2 Industrial Mini PC, showed that the accessible interfaces were limited to 3.3 V GPIO logic and CAN Bus 2.0B. CAN Bus was selected over conventional serial communication because its differential signalling offers greater robustness against electromagnetic interference. This was especially important as communication lines had to pass through the robot arm structure, where the cobot’s AC servo drives could introduce electrical noise.
Based on the available system peripherals, the power requirements of the end effector prototype, and the results obtained from preliminary testing of the 24 V electromagnet, a set of design deliverables was established for the control board.
| Key requirements for the circuitry |
|---|
| - Must have CAN Bus 2.0B communication onboard to facilitate communication with the ROS2 control stack via the Mini PC's CAN Bus peripherals. |
| - Must be able to switch the electromagnet (which operates at 24 V) on and off via commands received over the CAN Bus protocol. |
| - Must be able to control power delivery to the end effector, preventing live 12 V DC connections at the output when no end effector is connected, for safety purposes. |
To further reduce bill-of-materials cost, the board was designed to operate as either a master or slave through configuration. In the master role, it is mounted on the cobot side to handle power delivery and control the tool changer release mechanism. In the slave role, it is mounted on the end effector side to receive system commands and control the actuators and sensors of the attached tool. This reduced the number of unique board designs required and simplified manufacturing and integration.
After evaluating the range of microcontrollers available on the market, the STM32F103 was selected because it satisfied the functional requirements of the system while maintaining relatively low cost and manageable design complexity. The device provides sufficient digital I/O and integrated peripherals for the intended application, including CAN communication, UART communication, timer functions, and general-purpose control signals.
Based on the system requirements, an interaction architecture shown in figure 41 was then defined between the control boards and the Mini PC, such that the Mini PC serves as the primary computation and supervisory control unit, while the control boards handle local peripheral interfacing, power switching, and actuator-level control.
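To illustrate this interaction, the sketch below shows how the Mini PC side could issue commands over CAN using the python-can library. The arbitration IDs, payload bytes, and channel name are hypothetical placeholders, not the project's actual message definitions:

```python
# Sketch only: message IDs and payloads below are hypothetical, not the
# project's actual CAN protocol definitions.
import can

EMAG_CMD_ID = 0x101     # hypothetical electromagnet command ID
GRIPPER_CMD_ID = 0x102  # hypothetical end-effector command ID

def send_command(bus, arb_id, payload):
    """Transmit one CAN 2.0B (extended-ID) frame."""
    msg = can.Message(arbitration_id=arb_id, data=payload, is_extended_id=True)
    bus.send(msg)

# SocketCAN interface on Linux; the channel name is an assumption.
with can.Bus(interface="socketcan", channel="can0") as bus:
    send_command(bus, EMAG_CMD_ID, [0x01])     # energise electromagnet (dock)
    send_command(bus, GRIPPER_CMD_ID, [0x01])  # command gripper to close
```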
For the CAN Bus communication circuit, the SN65HVD232 (Figure 42) transceiver was selected due to its compatibility with 3.3 V microcontroller logic and its suitability for direct integration with the STM32-based control architecture. In addition, the device is widely used in embedded CAN applications and is supported by substantial reference material and implementation examples, which reduced development risk and simplified hardware validation.
Following the interim presentation, the selected electromagnet required a 24 V supply, while the gripper subsystem only required 12 V operation. As the STM32F103 operates using 3.3 V logic, dedicated switching circuits were required to allow the low-voltage control signals from the microcontroller to safely and reliably switch the higher-voltage loads.
For the 24 V electromagnet, a low-side switching circuit based on an N-channel MOSFET was implemented as shown in figure 43. In this topology, the MOSFET is positioned between the load and ground, such that enabling the MOSFET completes the current path through the electromagnet. Since the STM32 microcontroller operates at 3.3 V logic, its GPIO output is insufficient to directly provide the higher gate drive voltage required to fully enhance the selected N-channel MOSFET. A transistor driver stage consisting of an NPN transistor and a PNP transistor was therefore used as a level-shifting interface.
The 3.3 V GPIO signal first drives the NPN transistor, which then controls the PNP transistor connected to the 12 V rail. When activated, the PNP transistor pulls the MOSFET gate toward 12 V, thereby providing a substantially higher gate-to-source voltage than could be achieved directly from the microcontroller output. This ensures full enhancement of the MOSFET, resulting in a lower R_DS(on) and reduced power dissipation. A flyback diode was connected across the electromagnet to clamp the reverse voltage generated during turn-off, protecting the switching device against inductive overvoltage.
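To see why full enhancement matters, note that the MOSFET's conduction loss scales with its on-resistance (the numbers below are illustrative assumptions, not measured values):

$$P_{\text{cond}} = I_{\text{load}}^{2} \cdot R_{DS(\text{on})}$$

For example, at an assumed 1 A coil current, a fully enhanced device at 20 mΩ dissipates about 20 mW, whereas a partially enhanced device at several hundred milliohms would dissipate more than ten times as much.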
For switching the 12 V supply to the end effector, a high-side switching circuit based on a P-channel MOSFET was used as shown in figure 44. In this case, the MOSFET is placed on the positive supply rail, allowing the load to be disconnected from the 12 V source when required. Because the source terminal of the P-channel MOSFET sits near 12 V during operation, the microcontroller’s 3.3 V logic output alone is insufficient to directly pull the gate to the required level for effective switching. To address this, an NPN transistor was introduced to pull the MOSFET gate toward ground when commanded, producing a sufficiently negative V_GS to turn the P-channel MOSFET on. When the transistor is turned off, the gate is pulled back toward the source potential, turning the MOSFET off. This arrangement allowed the microcontroller to control the 12 V output rail safely using its 3.3 V logic levels.
To enable repeated coupling and decoupling between the robot-side and tool-side control boards, a pogo-pin contact interface was adopted. Spring-loaded pogo pins such as the ones shown in figure 45 below were placed on the tool side, with matching pogo pads on the robot side. Appendix F10 and F11 provide the specifications of the pogo pins used.
From these requirements, a hardware block diagram seen in figure 46 was developed to define the major functional sections of the circuit board and to map the selected components to each subsystem requirement.
Considering the tight dimensional constraints imposed by the mechanical design of the tool changing system, the control board dimensions were fixed at 30 mm × 40 mm × 10 mm in length, width, and height respectively. Under these constraints, the use of off-the-shelf development boards, prebuilt modules, or prototype perfboard construction was not suitable for the deployed prototype. As a result, a custom printed circuit board was designed in KiCad, based on circuit simulation, calculation, and validation carried out using LTspice and SMath Solver.
The board development was done iteratively, starting with the design and testing of individual circuit sections such as the CAN transceiver interface, the MOSFET switching circuits, and the microcontroller firmware. After validating each section independently, the full circuit was integrated into a single board design, which was then fabricated and assembled for final testing and deployment. Figure 47 shows the iterative prototypes developed during the design process, leading up to the final integrated control board design. See Appendix F13 to F16 for the prototype iterations in more detail.
The final control board integrates the STM32F103C8T6 microcontroller, the low-side and high-side switching circuits, and an onboard CAN Bus 2.0B transceiver into a single compact design. In the tool-side configuration, the board additionally supports up to six timer-based input/output channels, allowing it to control end effectors with as many as six servos. Figure 48 and Figure 49 below showcase the key components on the board. For the selection of components and calculations, see Appendix F1.
3D Model of the Tool Changer PCB
Appendix F2 to F8 goes through the board layers in more detail.
To validate the fabricated control boards, firmware was first uploaded through the onboard debugger interface, followed by LED tests to confirm basic microcontroller operation. Communication tests were then carried out to verify the CAN circuitry by transmitting signals from the robot-side board to the tool-side board, with onboard debugging LEDs used as visual indicators of successful data exchange. Finally, the MOSFET and transistor switching circuits were tested to confirm correct control of the 12 V supply and the electromagnet, both of which were successfully validated. Figure 50 below shows the testing of the control boards.
Upon completion of these electrical tests, the boards were integrated into the mechanical housings for the Robot-side and Tool-side assemblies, as seen in figure 51 below. More details on the mechanical design of the housings are provided in the following sections.
3.3.2 Mechanical Development
Regarding the mechanical design of the tool changing system, 3 key portions of the system need to be developed, namely:
- The Robot Side unit
- The End Effector Side unit
- The Tool Holder
3.3.3 The Robot Side Unit
The Robot side tool unit shown in figure 52 is meant to house the electromagnet and corresponding circuit board, with a major consideration for the alignment of the Tool side unit when docking. Successful, repeatable alignment of the tool side unit is crucial for both mechanical and electrical connections.
The docking interface was designed with chamfered alignment features to improve passive self-alignment between the robot-side and tool-side units. Tapered ends were applied to the tool-side alignment pins, while chamfered entry features were incorporated into the mating holes on the robot-side interface. The purpose of these lead-in geometries was to compensate for minor positional errors in the robot approach by guiding the mating parts into alignment during contact.
As shown in figure 53, the design reduces sensitivity to small lateral misalignments by converting initial contact forces into corrective guiding forces, allowing the interface to self-center before full engagement of the electromagnetic and electrical connection surfaces. As a result, the mechanism achieves more reliable and repeatable docking, while also reducing the risk of collision, jamming, and localized wear at the mating interface. The selected chamfer geometry was sufficient to tolerate approximately ±1 mm of positioning error during insertion. Figure 54 shows the free body diagram of the alignment pins working, where the forces generated from the chamfered features guide the pins into the holes for proper alignment.
The mounting of the camera was done on the robot side for the detection of the tools and assisting in grabbing. Further elaboration on the exact software is shown in section 4. In terms of the mechanical mounting, consideration was given to the camera’s RGB and Depth FOV as seen in figure 55, to ensure that the camera image was not blocked when the Tool side with the end effector was docked together.
3.3.4 The End Effector Side Unit
For the tool side unit shown in figure 56, the primary design considerations involve the mounting of the corresponding steel plate, which was validated during the interim presentation period (calculations in appendix E), alongside the placement of the dowel pins. The corresponding tool changer circuit board is mounted here as well.
Figure 57 shows the mounting holes placed on the underside of the holder for customisability in gripper placement and options. Additionally, the edges of the holder are designed to double as the supports on the tool holder.
3D Model of the Tool Changer Unit
3D Model of the Tool Changer Unit with end effector attached
3.3.5 The Tool Holder
The tool changer holder shown in figure 58 was designed with consideration of the robot's physical approach for docking and undocking, alongside part complexity.
The tool pickup strategy was intentionally designed as a vertical lift followed by linear forward clearance, rather than using an angled holder. This approach decouples load transfer from disengagement, resulting in more predictable tool extraction, reduced sideloading, improved repeatability, and simpler tolerance management. While an angled holder may appear to assist release geometrically, it would also introduce greater dependence on precise alignment and could increase the risk of rubbing, wedging, and inconsistent extraction. The selected method therefore offered a more robust and controllable solution for reliable autonomous tool exchange.
In this application, robustness and repeatability were prioritized over passive geometric assistance. The chosen holder geometry ensures the robot executes a deterministic extraction sequence that is easier to validate and better suited to autonomous operation. The docking and undocking sequence is shown in more detail in the figures 59 and 60 below, where the vertical lift and forward clearance can be observed.
The retaining “fingers” of the tool holder were intentionally designed to constrain the tool in the X and Y directions. This was necessary because the end effector behaves as a separate rigid body with respect to the main robotic platform. During AMR motion, especially under sharp angular turning, inertial loading acting on the gripper can generate lateral displacement relative to the holder. Without sufficient constraint, this could result in unintended disengagement of the tool from its storage interface. The addition of X- and Y-axis constraint therefore improves retention security during dynamic operation as shown in figure 61.
The basket unit as seen in figure 62 was designed to be mounted to the tool holder body, to allow for modularity should the basket be damaged or require replacement. The basket was designed to be large enough to accommodate the tools while also being compact to minimise the overall footprint of the storage system.
3D Model of the full Tool Changer Assembly
3.4 Overall integration and testing of tool changer and end effector
After the individual development of the end effector and tool changer subsystems, the final phase of this section involved the integration of all components into a cohesive system as shown in figure 63, followed by comprehensive testing to validate functionality and performance. This process included mechanical assembly, electrical wiring, firmware deployment, and system-level validation through a series of test scenarios designed to evaluate the reliability and effectiveness of the tool changing mechanism under realistic operating conditions.
The tool changing system was subjected to a range of tests, including repeated docking and undocking cycles to assess the durability and reliability of the mechanical interfaces, as well as functional tests to confirm the correct operation of the electrical control circuits and the successful actuation of the gripper mechanism as shown in figure 64 below.
The videos below demonstrate key aspects of the integrated system in action, showcasing the tool changer's ability to autonomously exchange end effectors and transmit data. The video showcasing the CAN Bus transmission code is in section 5, under the software control stack. Over 300 docking/undocking cycles were completed, with only 2 failures recorded; both were due to firmware bugs that were subsequently fixed.
The videos below show the tool changing system performing a pick-and-place operation. This test was hardcoded, with no feedback data.
When referencing the specification table on the size, mass and pick capabilities of the system, the system meets the requirements as follows:
| Subsystem | Design Specification | Value | Requirement Met |
|---|---|---|---|
| Overall system | Dimensional requirement | 165 × 230 × 135 mm (L × W × H) | YES |
| | Maximum mass | 2 kg | YES |
| | Actuation mode | Electric | YES |
| | Cost | < 500 SGD | YES |
| Grippers | Max gripper length | 130 mm | YES |
| | Maximum mass | 1 kg | YES |
| | Control mode | Open/closed loop | YES (open loop) |
| | Minimum gripping force | 1 kg | YES |
| | Payload support | 1–3 kg | NO |
| Tool changer system | Maximum mass | 1 kg | YES |
| | Max supporting payload mass | 10 kg (safety factor of 2) | NO |
| | Holds gripper system under power loss | | YES |
| | Fully controllable by robot system | | YES |
| | Able to transmit power and signal to gripper | | YES |
While most of the requirements were met, the system was not able to meet the payload support requirement for both the gripper and tool changer subsystem. The gripper was only able to reliably grasp up to 1kg of payload, while the tool changer was only able to support up to 5kg of payload before failure. This was due to limitations in the strength of the electromagnet used in the tool changer, as well as limitations in the mechanical design of the gripper and tool holder. Future iterations of the design could explore stronger electromagnets, improved gripper designs, and more robust tool holder geometries to increase the maximum payload capacity of the system.
4 Computer Vision Subsystem
4.1 Overview
Figure 65 shows the pegboard setup, which is divided into three sections: locations 1, 2, and 3. Each section is designated for a specific category of tools: location 1 stores Phillips screwdrivers, location 2 stores flathead screwdrivers, and location 3 stores a mallet.
The corresponding tool classes are defined as follows in figure 66:
The user will first input a tool name into the system. Based on the user’s selected tool type, the robot arm first moves to the designated location. The camera is then activated to capture an image of the scene. Object detection is performed using YOLOv11n, after which the 3D coordinates (XYZ) of the closest instance of the selected tool are computed relative to the camera frame. These coordinates are then transformed into the robot arm frame to guide the robot arm for grasping (frame transformation is covered in section 5.5). A visual illustration of this process is shown below in figure 67.
Due to time constraints, the project scope was narrowed down based on the following assumptions:
1. The tools are always positioned upright. While some deviations from vertical alignment are acceptable, they are not exaggeratedly slanted, messy or occluded.
2. The screwdrivers' arrangements are always consistent. For example, large flatheads (FH–L) will always be placed at the 2 leftmost holders, medium flatheads (FH–M) at the middle 2 holders, and small flatheads (FH–S) at the 2 rightmost holders (Figure 68).
Based on our observations and feedback from the Hub staff, these assumptions are reasonable as the tool holders are already sized according to fit specific screwdriver sizes. As a result, users tend to return tools to their designated positions, and in most cases place them upright, maintaining a consistent and structured layout as shown in figure 68.
4.1.1 Modifications to System
Figure 69 shows the original sets of screwdrivers in the Hub. For each screwdriver head type (FH and PH), there are 3 sizes: S, M, and L. These sizes are visually identical in appearance but differ in scale. This poses a challenge for the model when differentiating between screwdriver sizes, as different camera distances and perspectives may make the same screwdriver appear larger or smaller, leading to ambiguity in size classification.
This issue is particularly significant when only one class of screwdriver is present in the scene, as the model has no reference for relative size comparison. Apart from that, the dark screwdriver tips sometimes blend into the black pegboard, making classification based on screwdriver tips unreliable.
To tackle these challenges, the screwdrivers were color coded with distinct colors to make classification easier for the model, as shown in figure 70.
Other solutions that did not involve color coding the screwdrivers were explored but were not selected after evaluating their trade-offs (see figure 71). Ultimately, color coding was chosen as it is simple and provides the most reliable predictions.
4.1.2 Hardware Tools
Two hardware components form the physical foundation of the system: the Intel RealSense D435if depth camera, which handles scene capture, and the Advantech UNO-238 V2, which serves as the onboard robot PC (shown in figure 72). Their specifications are shown in figure 73.
Stream configuration:
To achieve a balance of speed and accuracy, both streams were configured at a resolution of 1280 × 720 at 30 fps.
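For reference, a minimal pyrealsense2 configuration matching these settings would look like the sketch below (standard SDK calls; the rest of the pipeline is described in section 4.3):

```python
import pyrealsense2 as rs

# Configure colour and depth streams at 1280 x 720, 30 fps.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 30)
profile = pipeline.start(config)
```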
Inference Hardware:
Although the Advantech PC is equipped with Intel Integrated Graphics (ADL GT2), it does not support CUDA, which is required for GPU-accelerated inference in PyTorch. As a result, inference during deployment is performed on the CPU. This is acceptable for the application, as detection is only triggered once per pick-and-place cycle rather than requiring continuous real-time inference.
4.1.3 Project Phases
The project is divided into three main phases: model development, computer vision pipeline development, and 3D coordinate validation.
The model development phase shown in figure 74 focuses on training the object detection model using a custom dataset. The computer vision pipeline development phase involves integrating the trained model with depth processing to compute 3D coordinates of detected objects. Finally, the 3D coordinate validation phase evaluates the accuracy and stability of the computed coordinates.
4.2 Model Development
4.2.1 Dataset Collection
To ensure consistency between training and inference, the same camera (D435if) used for inference was also used to capture the training images. The dataset consists of a total of 410 images, with 344 of them belonging to the screwdrivers and 66 belonging to the mallet.
As shown in figure 75, the screwdriver dataset was grouped into different variation categories to represent real-world conditions such as changes in distance, lighting, screwdriver arrangement, and robot position. For each variation group, images were captured at two locations (Loc 1 and Loc 2), corresponding to the robot arm's operating positions during inference. The number of images per group was decided based on how frequently each scenario is expected to occur, ensuring the model is better trained on common conditions while still exposed to less frequent cases for robustness.
As the exact robot position was not defined initially, images from groups A–C were captured using a stand approximating the expected position. Once the robot's position was finalized, subsequent images (groups D–I) were captured at the actual position under varied real-world conditions (see appendix H1).
The figures below show some examples of images from the dataset for each variation group:
As the mallet is a single tool class with no size variants, only 66 images were collected for its dataset (shown in figure 78), captured at location 3 across varied positions and lighting conditions.
4.2.2 Data Annotation
After data collection, all images were annotated to serve as ground truth for model training using Roboflow. Bounding boxes were drawn as tightly as possible around each object. For occluded objects, annotations were only made when the class could be confidently identified from visible features, to avoid introducing ambiguous training signals (see Figure 79).
4.2.3 Model Training and Performance Evaluation
The dataset was split into 3 sets as shown below in figure 80.
Model training and performance evaluation were carried out on two YOLO models, YOLOv11n and YOLOv26n. Ultimately, YOLOv11n was chosen due to its better performance.
The training configurations for both models are as follows (figure 81):
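For illustration, training with the Ultralytics API follows the pattern below; the dataset path and hyperparameters are placeholders rather than the exact values in figure 81:

```python
from ultralytics import YOLO

# Start from the pretrained nano checkpoint and fine-tune on the
# custom tool dataset (path and hyperparameters are placeholders).
model = YOLO("yolo11n.pt")
model.train(data="tools/data.yaml", epochs=100, imgsz=640)

# Evaluate on the held-out test split for the metrics reported below.
metrics = model.val(split="test")
```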
The test set was used for performance evaluation. The performance comparisons between models are shown in the figure 82 and 83 below. See appendix H2 for definitions of key metrics, and appendix H3 for additional training results.
From the results, YOLOv11n performs better than YOLOv26n across all evaluation metrics, achieving higher mAP, precision, and recall, which means more accurate detections and fewer missed objects. This is reflected in the confusion matrices, where YOLOv11n shows a clean diagonal with minimal errors, while YOLOv26n has more missed detections and slightly more spread. Although YOLOv26n is slightly faster, the difference is small compared to the drop in accuracy. Therefore, YOLOv11n is selected as the final model due to its better overall performance and robustness.
A video comparison is included below to make the performance differences easier to observe, even though the application performs inference on images.
4.3 Computer Vision Pipeline
The computer vision pipeline is deployed as a ROS2 service node (Figure 84). When the software control stack (section 5.4) issues a request specifying a target tool class, the client node activates the RealSense camera, runs the computer vision pipeline described in this section, and returns the estimated XYZ coordinates of the nearest matching target tool back to the client. The camera is then deactivated after each service call. The following subsection details each stage of this pipeline.
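Structurally, the service node follows the usual rclpy pattern sketched below; the GetToolPose service type and the run_vision_pipeline() helper are hypothetical stand-ins for the project's actual interface:

```python
import rclpy
from rclpy.node import Node

# Hypothetical service definition: string tool_class ->
#   bool found, float64 x, float64 y, float64 z
from tool_vision_interfaces.srv import GetToolPose


def run_vision_pipeline(tool_class):
    """Placeholder for sections 4.3.1 - 4.3.5: activate the camera, detect
    the requested class, compute its XYZ, then deactivate the camera."""
    return None  # an (x, y, z) tuple on success


class ToolVisionService(Node):
    def __init__(self):
        super().__init__("tool_vision_service")
        self.create_service(GetToolPose, "get_tool_pose", self.handle_request)

    def handle_request(self, request, response):
        result = run_vision_pipeline(request.tool_class)
        response.found = result is not None
        if result is not None:
            response.x, response.y, response.z = result
        return response


def main():
    rclpy.init()
    rclpy.spin(ToolVisionService())


if __name__ == "__main__":
    main()
```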
4.3.1 Color and Depth Frame Alignment
Two image streams are required: the RGB frame, which is used for object detection using YOLO, and the depth frame, which provides distance information for computing the 3D coordinates of detected objects.
Due to the physical offset between the colour and depth sensors on the D435if, a pixel (cx, cy) in the RGB image does not inherently correspond to the same point in the depth image. Since YOLO operates on the RGB frame and outputs pixel coordinates in the RGB coordinate space, the depth frame is aligned to the RGB frame. This ensures that depth values can be directly looked up at the same pixel coordinates that YOLO produces. In other words, this alignment warps the depth image into the colour camera's perspective, ensuring pixel correspondence between the two streams. (Figure 85)
This stream alignment step is necessary for the deprojection function in Section 4.3.4, which requires each pixel's 2D coordinate and depth value to refer to the same point in space.
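In the SDK, this alignment is a single processing block, sketched below with pyrealsense2 (stream configuration as in section 4.1.2):

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
pipeline.start()  # see section 4.1.2 for the full stream configuration

# Warp every depth frame into the colour camera's perspective so that
# YOLO pixel coordinates index directly into the depth image.
align = rs.align(rs.stream.color)
frames = pipeline.wait_for_frames()
aligned = align.process(frames)
color_frame = aligned.get_color_frame()
depth_frame = aligned.get_depth_frame()
```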
4.3.2 Object Detection
For each detected object, the model predicts a bounding box. A grasp point (cx, cy) is then defined within this bounding box to facilitate robot grasping. This point represents a pixel location in the 2D RGB image.
For screwdrivers, the grasp point is set at 15% of the bounding box height from the top, targeting the handle end. For the mallet, it is set at 40% from the top, targeting the shaft. This ensures that the computed 3D coordinate corresponds to a physically meaningful grasp location as shown in figure 86.
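In code, the grasp point reduces to a few lines; the class name string is illustrative:

```python
def grasp_point(x1, y1, x2, y2, tool_class):
    """Grasp pixel for an axis-aligned box (x1, y1)-(x2, y2), per the
    ratios above: 15% of box height from the top for screwdrivers
    (handle end), 40% for the mallet (shaft)."""
    ratio = 0.40 if tool_class == "mallet" else 0.15
    cx = (x1 + x2) / 2.0
    cy = y1 + ratio * (y2 - y1)
    return cx, cy
```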
4.3.3 Depth Filtering
After the grasp point (cx, cy) is determined, a reliable depth value is required to compute the corresponding 3D coordinate.
The depth value Z plays a critical role in 3D coordinate computation (see Appendix H4), as it directly affects the accuracy of the computed X and Y values.
However, raw depth data may contain noise or invalid measurements (e.g., zero-depth pixels). Therefore, two stages of filtering are applied to the depth data to improve robustness.
1. SDK Filtering (Figure 87):
2. Patch-based depth sampling
From the depth map (figure 88), we can see that certain regions on the screwdriver handle could still contain zero depth values (black pixels). If only a single depth value at (cx,cy) is used, it may coincide with these invalid pixels, resulting in incorrect 3D coordinates.
To solve this, a 9 × 17 pixel patch centered on the grasp point is sampled, as shown in figure 89. Within the patch, pixels with zero depth values are first discarded. Among the remaining pixels, those whose depth values fall within a set tolerance of the patch median are retained (the green pixels). Doing so helps eliminate inconsistent or unwanted depth data within the patch. These remaining valid pixels are then passed to the 3D coordinate computation stage described in the next section.
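A NumPy sketch of this two-step patch filter is shown below; the median tolerance value is an assumed placeholder:

```python
import numpy as np

def sample_patch_depths(depth_m, cx, cy, half_h=4, half_w=8, tol_m=0.01):
    """Return valid depths (metres) in a 9 x 17 patch centred on (cx, cy).

    Zero-depth pixels are discarded first; remaining pixels are kept only
    if they lie within tol_m of the patch median (tol_m is an assumed
    placeholder value).
    """
    r, c = int(cy), int(cx)
    patch = depth_m[r - half_h : r + half_h + 1, c - half_w : c + half_w + 1]
    valid = patch[patch > 0]                       # drop invalid (zero) pixels
    if valid.size == 0:
        return np.array([])
    median = np.median(valid)
    return valid[np.abs(valid - median) <= tol_m]  # keep near-median pixels
```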
4.3.4 3D Coordinate Computation via Deprojection
Deprojection refers to the process of converting a 2D pixel coordinate into a 3D point in real-world space.
In this pipeline, the RealSense SDK function rs.rs2_deproject_pixel_to_point() is used to perform this conversion. This function takes in the pixel coordinate (px,py), the corresponding depth value Z and the camera intrinsic parameters, to return the 3D coordinates (see figure 90).
To improve robustness, deprojection is not performed on a single pixel. Instead, all valid pixels within the sampled patch from section 4.3.3, are individually deprojected, and the final XYZ is computed as the median of these points. This reduces the effect of depth noise and ensures a more stable estimate of the object position.
As the depth frame is aligned to the RGB frame (Section 4.3.1), and the RGB camera intrinsics are used in the deprojection function, the resulting coordinates are expressed relative to the RGB camera optical frame (see figure 91).
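Combining sections 4.3.3 and 4.3.4, the per-pixel deprojection and median aggregation look like the following sketch, where intrinsics are the RGB stream intrinsics from the aligned profile:

```python
import numpy as np
import pyrealsense2 as rs

def deproject_patch(intrinsics, pixels, depths_m):
    """Deproject each valid (px, py) sample with its depth Z (metres)
    and return the median (X, Y, Z) as a robust position estimate."""
    points = [rs.rs2_deproject_pixel_to_point(intrinsics,
                                              [float(px), float(py)],
                                              float(z))
              for (px, py), z in zip(pixels, depths_m)]
    return np.median(np.asarray(points), axis=0)
```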
Further details on the underlying projection and deprojection model are provided in Appendix H4 - H5.
4.3.5 Target Selection
Once all detections in the scene have been processed and assigned 3D coordinates, the pipeline filters only those matching the requested tool class. Among the remaining detections, the system returns the coordinate value of the target class that is closest to the camera's optical axis. Closest proximity is computed as the Euclidean distance in the XY plane from the camera origin (0, 0) to the grasp point of each detection, as illustrated in Figure 92.
This approach is used instead of full 3D Euclidean distance as the camera surface may not always be perfectly parallel to the pegboard due to robot misalignment, causing Z values to vary across the scene even when all tools lie on the same physical plane. Including Z would therefore introduce unintended bias into the selection. Thus, restricting the calculation to the XY plane ensures that tool selection remains consistent and predictable with respect to the actual tool position.
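The selection rule itself is a short minimisation over the matching detections, sketched below:

```python
import math

def select_target(detections, target_class):
    """detections: list of dicts like {"cls": str, "xyz": (x, y, z)} in
    the camera frame. Returns the match nearest the optical axis in the
    XY plane, or None (reported to the client as NOT FOUND)."""
    candidates = [d for d in detections if d["cls"] == target_class]
    if not candidates:
        return None
    return min(candidates, key=lambda d: math.hypot(d["xyz"][0], d["xyz"][1]))
```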
The result of the target selection is returned to the client via a service response. In the successful case, the response contains the 3D coordinates (X, Y, Z) of the selected target for downstream motion execution. If no matching object is detected, a failure response is returned, which is displayed as a “NOT FOUND” message on screen, as shown in figure 93.
4.4 3D Coordinate Accuracy Validation
Based on the hardware specifications of the camera shown in section 4.1.2, the expected worst-case depth error at our application distance of 450 mm is ±9 mm (2% × 450 mm = 9 mm). As for the tolerance of the gripper fingers, the largest screwdriver fits with approximately 10 mm of clearance, which defines the allowable positioning error.
Validation tests were conducted to confirm that the coordinates returned by the computer vision pipeline are accurate.
When testing the full computer vision pipeline under actual deployment conditions, we observed that the grasp point is consistently aligned with the gripper center (figure 94). Of the 150 grip tests conducted for the screwdriver, 149 were successful, while 1 failed due to a collision with the screwdriver above. The mallet test also showed consistent results, as shown in figure 95.
Prior to these validation tests done on the actual deployment scene, some preliminary tests were also conducted to gauge the accuracy of the computed coordinates. Refer to appendix H6 for these tests.
Below are some videos showing the computer vision pipeline in action.
4.5 Alternative Methods Explored
Before the current pipeline was deployed, additional techniques were explored to improve the precision and reliability of the grasp point and depth computation.
4.5.1 Bounding Box Smoothing
Bounding box smoothing was performed by running object detection across 5 RGB frames and computing the average grasp point (cx, cy) to reduce detection jitter. However, as shown in figure 96, detected grasp point locations varied by at most ~1px across frames, indicating that the bounding box predictions were already sufficiently stable and that averaging provided negligible benefit. This method was therefore not adopted.
4.5.2 Median XYZ Across Multiple Depth Frames
This method was explored to reduce frame-to-frame depth noise by estimating the XYZ coordinates as the median across 10 depth frames per capture. As shown in figure 97, benchmarking over 50 captures showed only a small reduction in variance compared to single-frame estimates (Euclidean standard deviation: 1.832 mm → 1.386 mm). Since the improvement (~0.45 mm) is small, the added computation cost was not justified. Thus, this method was not adopted.
This method, however, may be more useful in scenarios where inference is performed on a live video stream, and the camera or objects are in motion, unlike the static setup in this project where both the camera and tools remain stationary. In such cases, frame-to-frame variation in the estimated XYZ values becomes more pronounced, and aggregating multiple depth frames can improve stability.
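A sketch of the median-based estimator is shown below, assuming a hypothetical deproject helper that maps a pixel to camera-frame XYZ coordinates (for example via the camera intrinsics).

```python
import numpy as np

def median_xyz(depth_frames, deproject, cx, cy):
    """Estimate a grasp point's XYZ as the per-axis median over several
    depth frames, reducing frame-to-frame depth noise.

    `deproject(frame, cx, cy)` is assumed to return an (x, y, z)
    coordinate in metres (hypothetical interface).
    """
    xyz = np.array([deproject(f, cx, cy) for f in depth_frames])
    return np.median(xyz, axis=0)  # median is robust to outlier depth reads
```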
4.6 Limitations of Current Pipeline
A limitation of the current pipeline is that it performs poorly in edge cases, such as when the screwdrivers are cluttered, slanted, or occluded. As the system relies on axis-aligned bounding boxes and a fixed grasp point, it does not account for object orientation. Consequently, the computed grasp point (cx, cy) may fall outside the object or on a non-graspable region, such as the shaft. Occlusion can also reduce detection accuracy and lead to incorrect classifications.
Several tests were conducted at various screwdriver angles.
Firstly, the screwdriver was tilted such that the grasp point lies at the edge of the screwdriver contour (figure 98). As seen in the video below, the robot still managed to grasp it, but only barely.
Next, the screwdriver was tilted further, until the grasp point no longer lay on the screwdriver at all (figure 99). As seen in the video below, the gripper moved too far forward into the pegboard and failed to grab the screwdriver.
To be able to grab slanted objects, other types of models could be used. These will be discussed in section 7.2.
5 Software Control Stack
5.1 Updated software strategy and architecture
From the interim presentation, a user sequence diagram, shown in figure 100, and an initial software architecture diagram, shown in Figure 101, were developed to represent the idealised control flow and software structure of the system.
For the final deployed prototype, the software architecture was further streamlined, as illustrated in the updated diagram shown below in figure 102.
Throughout the development of the final prototype software, the implementation approach was continually refined and, where necessary, restructured to meet the practical operating requirements of the system. The following sections describe in detail the individual nodes and their respective functionalities that enable the overall system to operate.
5.2 MoveIt2 Trajectory Planner
The MoveIt2 trajectory planning node is responsible for generating feasible manipulator motion trajectories for the robotic arm. Its purpose is to translate target poses or motion goals into executable arm movements that respect the robot’s kinematic structure, joint limits, and safety constraints. In the developed system, this node is essential for actions such as moving the Cobot to predefined standby positions, approaching a tool holder, aligning with an object, executing pickup or placement motions, and returning the arm to a safe home configuration.
One of the main reasons for using MoveIt2 is its ability to handle inverse kinematics and motion planning in a standardized framework. In a practical system, commanding the robot to move from one pose to another is not simply a matter of assigning Cartesian coordinates. The software must determine whether the target can be reached, which joint configuration should be used, and whether the resulting motion is smooth and safe. MoveIt2 provides this capability while also supporting collision checking, trajectory generation, and planning scene integration when required. This significantly reduces the need for manual low-level trajectory derivation and improves the reliability of arm motion.
From the interim presentation, various trajectory planners were tested, and the PILZ trajectory planner was selected for its stable trajectory planning and its support for linear trajectories.
The diagram in figure 103 shows how the node, called the Master Cobot Action Node, interacts with the other software subsystems in the overall architecture. The node's endpoints are ROS2 services that allow for control and data reading of the Cobot during its execution. The sequencer node, which directly orchestrates the subsystems, interacts with it mainly through these service calls in the form of sequence requests.
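The sketch below illustrates the general pattern of such a sequence request from the sequencer's side. The service name and the generic std_srvs Trigger type are placeholders; the actual service interfaces are project-specific.

```python
import rclpy
from rclpy.node import Node
from std_srvs.srv import Trigger  # placeholder type; the real interface is project-specific

class CobotClient(Node):
    """Minimal sketch of the sequencer issuing a sequence request to the
    Master Cobot Action Node over a ROS2 service. Names are illustrative."""

    def __init__(self):
        super().__init__("cobot_client_demo")
        self.cli = self.create_client(Trigger, "/cobot/move_to_standby")
        while not self.cli.wait_for_service(timeout_sec=1.0):
            self.get_logger().info("waiting for cobot service...")

    def request_standby(self):
        # Call the service asynchronously and block until the result arrives.
        future = self.cli.call_async(Trigger.Request())
        rclpy.spin_until_future_complete(self, future)
        return future.result()  # .success / .message indicate the outcome

def main():
    rclpy.init()
    node = CobotClient()
    result = node.request_standby()
    node.get_logger().info(f"standby result: {result.success}")
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```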
5.3 AMR control stack
The SESTO node is responsible for managing communication with the SESTO Magnus Lite AMR platform and providing the rest of the software system with an interface for navigation and control functions. Its primary function is to accept location-based commands from the sequencer and convert them into navigation requests that the mobile platform can understand. This involves calling external APIs, sending requests to the base controller, or continuously polling platform status to monitor progress.
From the diagram above in figure 104, the main interactions with the software architecture involve the SESTO node accepting service and action requests from the sequencer. These service and action requests map directly to API calls made to the SESTO's onboard API server to execute the motions. The table below lists the endpoints used by the system, with a short request sketch following the table.
| API endpoints | HTTP Method | Description |
|---|---|---|
| /amrs/mode | GET | Retrieves the current AMR mode, typically after maintenance operations. |
| /amrs/mode | PUT | Sets the AMR operating mode for maintenance or mapping. |
| /amrs/pause | PUT | Pauses the AMR. |
| /amrs/recovery | POST | Initiates the AMR recovery procedure during initial startup. |
| /amrs/release/station | POST | Releases the AMR from its docking or station state so that it can accept navigation commands. |
| /amrs/release/station | GET | Retrieves the AMR station release status. |
| /amrs/statuses | GET | Retrieves internal AMR status parameters, such as battery state and other system diagnostics. |
| /goals | POST | Creates a navigation goal for the AMR. |
| /goals/cancel | POST | Cancels active navigation goals, typically during emergency handling. |
| /events | PUT | Clears active error or event codes. |
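The sketch below shows how two of these endpoints might be called from the SESTO node. The server address and the payload and status field names are assumptions for illustration; the authoritative schema is defined in the SESTO software reference manual.

```python
import requests

SESTO_API = "http://192.168.1.10:8080"  # placeholder address for the onboard API server

def send_goal(waypoint_id: str) -> bool:
    """Release the Station-Lock, then post a navigation goal.
    Payload fields are illustrative assumptions."""
    # Release Station-Lock so new navigation requests are accepted.
    requests.post(f"{SESTO_API}/amrs/release/station", timeout=5).raise_for_status()
    # Create the navigation goal for the chosen waypoint.
    resp = requests.post(f"{SESTO_API}/goals", json={"waypoint": waypoint_id}, timeout=5)
    return resp.ok

def is_busy() -> bool:
    """Poll AMR status so the sequencer can gate downstream actions."""
    status = requests.get(f"{SESTO_API}/amrs/statuses", timeout=5).json()
    return status.get("navigating", False)  # field name assumed
```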
The SESTO node also contributes to execution reliability by returning status information back to the sequencer. Once a navigation goal is issued, the sequencer does not simply assume the movement has completed. Instead, it relies on the SESTO node to monitor whether the platform is travelling, has arrived, or has encountered an issue. This is especially important in integrated workflows where downstream actions, such as tool pickup or delivery, should only begin once the base is correctly positioned. Without this feedback mechanism, the system could attempt manipulator operations at the wrong location, leading to failed or unsafe execution.
When the AMR reaches its destination, it enters Station-Lock mode, in which no navigation requests are accepted until it is specifically released. This serves as an intermediary safety layer, preventing navigation requests from being executed while the arm may be moving simultaneously, which could lead to unsafe conditions. The release of this Station-Lock is done via the API endpoint /amrs/release/station and is shown in figure 105.
New waypoints can also be created using API calls to the system, but for convenience, the Fleet Server UI onboard the AMR, seen in figure 106, was used instead, as its GUI overlay shows the locations of the waypoints and makes implementation easier. Each key point can then be referenced in the API calls.
Navigation accuracy was noted as a point of concern: repeatability tests over 50 navigation attempts to the same defined location showed that the AMR can drift by about ±20 mm in both the X and Y directions. This poses an issue during pick and place, as hardcoded motions will not suffice; the whole system can drift far enough that the gripper misses outright. This inaccuracy is addressed through computer vision, as discussed in section 4 and section 5.4, and examined further in section 5.3.1.
5.3.1 SESTO Navigation Accuracy and Mitigation
The SESTO Magnus Lite AMR platform exhibits a navigation accuracy issue, with observed drift of approximately ±20 mm in both X and Y directions after repeated navigation attempts to the same location. An example of this drift is shown in figure 107 where the AMR is seen to have deviated from a marker reference. This level of inaccuracy poses a significant challenge for tasks such as pick-and-place operations, where precise positioning is critical for successful item grasping.
Based on the datasheet, the Sesto Magnus Lite has a navigation accuracy of ±20 mm, which is consistent with the observed drift during testing. The exact positional readings can be found in Appendix N. This means that even if the AMR successfully reaches the general vicinity of the target location, there is still a possibility that it may be misaligned by up to 20 mm, which can lead to failed grasps if the robot arm relies solely on hardcoded motions based on expected coordinates.
Consideration was given to using a docking marker that the AMR could reference and dock to. However, the marker's placement and low-profile design would make it a tripping hazard, as seen in figure 108. Hence, this idea was scrapped.
To mitigate this issue, the system incorporates computer vision techniques to dynamically adjust the robot's position during pick-and-place tasks. By using visual feedback from the camera, the robot can detect the actual location of the target item and make real-time adjustments to its position, compensating for any drift caused by navigation inaccuracies.
5.4 CAN Bus Communication
The CAN bus node provides the low-level hardware communication interface between the ROS2 software stack and the physical end-effector and peripheral devices connected over the CAN network. The CAN bus node interfaces with the drivers and corresponding CAN peripherals onboard the UNO 238 V2 MiniPC, whose interfaces are shown in figure 109.
In the context of the robotics system, the CAN node mainly interfaces with the tool changer unit's circuit boards, described in section 3. The interaction between the CAN node and these peripherals is illustrated in figure 110 below:
The CAN node itself exposes various services that the sequencer can call. These services control the tool changer's electromagnet, power delivery, and the gripper's open and close commands. Upon calling any of these services, the CAN node transmits a CAN frame of the format shown in figure 111.
In this CAN frame, the commands for the electromagnet, power delivery, and gripper state are carried in bits 6, 5, and 4 respectively, while bits 3 to 0 are reserved for other command structures.
Although the tool-side control board can also receive CAN frames from the MiniPC directly, a separated form of control was adopted instead: the robot-side control board manages the commands from the MiniPC and monitors the state of the tool-side control board. In the event of any loss of connection on the tool side, the robot side can then notify the MiniPC.
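A sketch of how such a frame could be packed and transmitted with the python-can library is shown below; the arbitration ID and bit polarity are illustrative assumptions, and a SocketCAN interface (can0) is assumed to be available.

```python
import can

CMD_ARB_ID = 0x120  # placeholder arbitration ID for the tool changer board

def build_command(magnet_on: bool, power_on: bool, gripper_close: bool) -> can.Message:
    """Pack the command byte described in figure 111: bit 6 = electromagnet,
    bit 5 = power delivery, bit 4 = gripper state, bits 3-0 reserved.
    ID and bit polarity are illustrative assumptions."""
    byte = (magnet_on << 6) | (power_on << 5) | (gripper_close << 4)
    return can.Message(arbitration_id=CMD_ARB_ID, data=[byte], is_extended_id=False)

def send_command(msg: can.Message) -> None:
    # Open the SocketCAN channel, transmit one frame, then shut down cleanly.
    with can.Bus(channel="can0", interface="socketcan") as bus:
        bus.send(msg)
```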
5.5 Computer Vision
The computer vision node is responsible for processing tool recognition data and returning the depth and detection information required for the system to perform pick-and-place operations or to terminate an order when the requested tool is not detected. The detailed processing methodology is discussed in Section 4, Computer Vision. Accordingly, this section focuses on the role of the computer vision node within the control architecture and its interactions with the rest of the software stack.
The interaction diagram in figure 112 illustrates the relationship between the computer vision node and the wider control stack, with the sequencer node acting as the primary service client.
Upon receiving a request from the sequencer node, the computer vision node takes in the name of the requested tool and begins processing the visual input to determine whether the tool is present. If the request is completed successfully and valid frame data is returned, the sequencer node then computes the new target coordinates required for motion execution. While this process is based on frame transformations, it was simplified into vector calculations under the assumption of fixed frame relationships. The following sections describe this calculation in greater detail.
5.5.1 Target Position Derivation
To command the robot such that the gripper reaches the target detected by the depth camera, a vector-based derivation was used. Under the simplifying assumption that all coordinate frames are aligned, rotational transformations reduce to the identity and only translational offsets need to be considered. The required position can therefore be obtained through direct vector addition and subtraction. Figure 113 illustrates the relevant coordinate frames and vectors involved in the derivation.
5.5.2 Symbol Definitions
| Symbol | Description |
|---|---|
| $\mathbf{p}_1$ | Current robot endpoint position in the world frame |
| ${}^{1}\mathbf{p}_2$ | Gripper position relative to the robot endpoint |
| ${}^{1}\mathbf{p}_3$ | Depth camera position relative to the robot endpoint |
| ${}^{3}\mathbf{p}_4$ | Detected target position relative to the depth camera |
| $\mathbf{p}_4$ | Target position in the world frame |
| $\mathbf{p}_1^{desired}$ | Desired robot endpoint position in the world frame |
| $\mathbf{p}_2^{desired}$ | Desired gripper position in the world frame |
5.5.3 Derivation
The target position in the world frame is first expressed by summing the current endpoint position, the camera offset from the endpoint, and the detected target offset from the camera:

$$\mathbf{p}_4 = \mathbf{p}_1 + {}^{1}\mathbf{p}_3 + {}^{3}\mathbf{p}_4 \quad (1)$$
To ensure that the gripper coincides with the detected target, the desired gripper position is set equal to the target position:

$$\mathbf{p}_2^{desired} = \mathbf{p}_4 \quad (2)$$
Since the gripper is mounted with a known offset relative to the robot endpoint, the desired gripper position may also be written as:

$$\mathbf{p}_2^{desired} = \mathbf{p}_1^{desired} + {}^{1}\mathbf{p}_2 \quad (3)$$
Substituting Equation (2) into Equation (3) gives:

$$\mathbf{p}_4 = \mathbf{p}_1^{desired} + {}^{1}\mathbf{p}_2 \quad (4)$$
Rearranging for the desired robot endpoint position:

$$\mathbf{p}_1^{desired} = \mathbf{p}_4 - {}^{1}\mathbf{p}_2 \quad (5)$$
Substituting Equation (1) into Equation (5) yields the final expression for the desired endpoint position:

$$\mathbf{p}_1^{desired} = \mathbf{p}_1 + {}^{1}\mathbf{p}_3 + {}^{3}\mathbf{p}_4 - {}^{1}\mathbf{p}_2 \quad (6)$$
5.5.4 Final Expression
Equation (6) shows that the desired robot endpoint position is obtained by starting from the current endpoint position, adding the camera mounting offset, adding the target offset measured by the depth camera, and subtracting the gripper offset. The gripper term is subtracted because the robot is controlled through the endpoint frame, whereas the actual point required to coincide with the target is the gripper.
After the calculation is completed, the sequencer then moves the end effector into the desired position by sending direct frame goal requests to the MoveIt2 trajectory planner, which then computes the required trajectory to move the robot to the desired position.
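Under the stated aligned-frames assumption, the complete calculation reduces to a few vector additions. The sketch below implements Equation (6); the function name and argument order are illustrative.

```python
import numpy as np

def desired_endpoint(p1, p2_off, p3_off, p4_cam):
    """Equation (6): p1_desired = p1 + {1}p3 + {3}p4 - {1}p2, valid only
    under the aligned-frames assumption (rotations reduce to the identity).

    p1     : current endpoint position in the world frame
    p2_off : gripper offset from the endpoint ({1}p2)
    p3_off : camera offset from the endpoint ({1}p3)
    p4_cam : detected target offset from the camera ({3}p4)
    """
    p1, p2_off, p3_off, p4_cam = map(np.asarray, (p1, p2_off, p3_off, p4_cam))
    # Add camera offset and measured target offset, subtract gripper offset
    # because the robot is commanded through the endpoint frame.
    return p1 + p3_off + p4_cam - p2_off
```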
5.6 Sequencer
The sequencer node forms the core supervisory layer of the robotic software architecture. Its primary function is to coordinate the execution of tasks across the different subsystems and ensure that operations occur in the correct order. In a multi-node robotic application, individual components such as navigation, motion planning, end-effector control, and user interaction may all function correctly on their own, but the overall system will not behave reliably unless these components are synchronized. The sequencer was therefore implemented to serve as the central decision-making and orchestration layer.
From a functional perspective, the sequencer receives higher-level requests that correspond to complete robotic actions, such as performing a delivery, retrieving a tool, returning to a home pose, or executing a pickup sequence. Rather than directly controlling motors or actuators, the sequencer breaks each high-level task into a series of ordered sub-operations. These sub-operations are then issued to the relevant nodes using ROS2 services, actions, or topic-based communication depending on the nature of the task. This design allows the logic of the task flow to be managed in one place, while hardware-specific or subsystem-specific details remain isolated in their own nodes.
In the diagram shown in figure 114, the sequencer node interfaces with all the relevant hardware control nodes. Since the sequencer can control all items with proper logic, it plays a critical role in fault handling and recovery. In practical robotic systems, failures such as missing tools, unsuccessful grasps, delayed status updates, or unexpected subsystem states are common. To address this, the sequencer includes conditional branches that alter the execution flow based on returned results. For instance, if a requested screwdriver is not detected at the pickup location, the sequencer can record that outcome, skip the failed pickup, and proceed to inspect the next valid request instead of remaining stuck in an incomplete state. If no valid items remain, the node can trigger a safe homing or recovery procedure. This makes the overall system more robust and better suited for semi-structured real-world operation.
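The branching behaviour described above can be summarised in the following sketch; all node interfaces shown are illustrative stand-ins for the actual service and action clients.

```python
def run_delivery(requested_tools, cv_client, cobot, webserver):
    """Minimal sketch of the pickup branching: skip tools the vision node
    cannot find, report them, and home safely if nothing remains.
    All interfaces here are hypothetical stand-ins."""
    unavailable = []
    picked = []
    for tool in requested_tools:
        target = cv_client.locate(tool)        # returns XYZ or None
        if target is None:
            unavailable.append(tool)           # record the outcome and move on
            webserver.report_unavailable(tool)
            continue
        cobot.pick(target)
        picked.append(tool)
    if not picked:
        cobot.home()                           # safe recovery, no delivery
        webserver.report_failed()
        return False
    return True
```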
5.7 Webserver and User interface
The webserver node functions as the primary human-machine interface for the robotic system. Its role is to allow users to interact with the robot through a browser-based interface rather than requiring direct access to ROS2 commands, terminals, or internal software tools. This is especially important in application-oriented robotic systems where the end users may not be robotics developers and need a simple method to initiate tasks and observe system status.
From a functional standpoint, the webserver accepts user inputs such as delivery requests, task selections, or other operation commands. These inputs are then passed into the backend control logic, where they are interpreted and relayed to the sequencer or relevant subsystem. In this way, the webserver does not directly command the robot at the hardware level but instead serves as a structured interface between user intent and robotic execution. This is shown in the interaction diagram in figure 115.
In the UI, the user has the option to select multiple tools for delivery, as seen in figure 116 below. They can then select the table to which the deliveries should be made. Upon clicking the Send Request button, the commands are sent to the sequencer node, where the system decides how to control the hardware to complete the order.
In the lower dialogue box, the user can monitor the delivery through a set of parameters that update dynamically based on feedback from the sequencer node. A description of these parameters is given in the table below:
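A minimal sketch of this bridging pattern is shown below, using Flask and a plain string topic as illustrative stand-ins for the actual webserver implementation and message types.

```python
import threading

import rclpy
from rclpy.node import Node
from std_msgs.msg import String
from flask import Flask, request

# Bridge node: republishes browser requests onto a ROS2 topic for the
# sequencer. Topic name and message type are illustrative assumptions.
rclpy.init()
bridge = Node("web_bridge_demo")
pub = bridge.create_publisher(String, "/delivery_request", 10)
threading.Thread(target=rclpy.spin, args=(bridge,), daemon=True).start()

app = Flask(__name__)

@app.post("/send_request")
def send_request():
    order = request.get_json()  # e.g. {"tools": [...], "table": "T3"}
    msg = String()
    msg.data = str(order)
    pub.publish(msg)            # user intent handed off to the sequencer
    return {"status": "accepted"}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```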
| Parameter | Description |
|---|---|
| Items to be delivered | This dialogue updates to show the tool selection the users have selected for the robot to perform a delivery. |
| Unavailable items | During the actual pickup of tools, if the system records a tool as unavailable, the option will be seen here. |
| Deliver to | The table selected by the user for delivery. |
| Status of robot | The feedback from the sequencer which shows which hardware is being controlled at which stage. Error messages can also appear here. |
| Status of delivery | The status of delivery from No active delivery -> In progress -> Delivery completed. |
An additional set of buttons and debug console are also present, mainly for resetting the system and conducting tests. This is shown in figure 117.
Figure 118 shows the web UI prototype in its entirety. The videos in section 5.8 also show the webserver in action, where the user can monitor the delivery process and see the feedback from the system.
5.8 Overall software functionality
Figure 119 presents an overview of the ROS 2 services and actions implemented within the system nodes. After all nodes have been successfully initialised and confirmed to be operational, the system is able to accept incoming tool delivery requests. The flowchart below illustrates the corresponding system workflow and internal control logic executed upon receipt and acceptance of a delivery request.
The sequence flowchart in figure 120 shows the idealised flow of the system from the point of receiving a delivery request to the completion of the delivery. The flowchart also includes the various decision points and feedback loops that are present in the system, such as checking for tool availability, monitoring navigation status, and handling edge cases. This flowchart serves as a reference for understanding how the different software components interact to achieve the overall functionality of the robotic system.
The videos below showcase the full system running autonomously with the webserver integration. The first video shows the system collecting and delivering the requested screwdrivers based on user input, and the second shows the system returning to the home position after the delivery is completed and acknowledged.
6 Edge case testing and safety considerations
This section documents the safety features implemented onboard the robot to protect operators and people in the environment, as well as the system's edge-case handling behaviour.
6.1 Safety implementation
Onboard the AMR and Cobot, there are integrated safety peripherals (see figure 121) that prevent the system from injuring or harming users. One of the main changes the team made to further improve safety was to merge the e-stop logic of the AMR and the Cobot, allowing the full system to be shut down together. The following videos showcase these safety peripherals during use of the system.
6.2 Edge case testing
Beyond normal operating conditions, the robotic system must also be capable of handling edge cases that occur during task execution. In practice, these conditions may arise from missing tools, unsuccessful perception results, communication irregularities, or failures in intermediate action steps. To address this, the system logic was designed with explicit handling pathways for such scenarios, allowing the robot to respond in a controlled and deterministic manner. Depending on the nature of the fault, the system may attempt an alternative sequence, skip the affected task, notify the supervisory interface, or return to a predefined safe state. This behaviour is important in ensuring operational robustness, safety, and continuity of service.
6.2.1 Tool availability and pickup edge cases
This subsection addresses edge cases that occur when a requested tool is unavailable, incorrectly positioned, or cannot be successfully picked up by the robot.
| Trigger condition | Affected task | Identified by | System response |
|---|---|---|---|
| Screwdriver requested not detected in CV scanning. | Tool pickup | Computer vision model | If other tools are requested, the system processes those orders. Otherwise, the sequencer node updates the webserver that the delivery has failed and the robot returns to idle. |
| Screwdriver requested is slanted in the holder. | Tool pickup | N.A. | The system attempts to pick up the tool in the skewed position. If the tool drops out, the system continues as per normal. |
The video below shows an edge case where no tools are available. The corresponding system response is that the robot returns to the home position and the webserver shows that the delivery has failed.
6.2.2 Motion and manipulation edge cases
This subsection considers edge cases arising during robot motion or physical interaction with tools and fixtures.
| Trigger condition | Affected task | Identified by | System response |
|---|---|---|---|
| The Cobot impacts a person or object. | Any sequence. | Onboard torque sensors on the Cobot. | The Cobot immediately enters an alarm state and terminates any further movement goals. The system must then be restarted. |
| The AMR detects an obstacle blocking the goal. | Any navigation sequence. | AMR's onboard LIDAR and camera. | The AMR sounds an alert for the obstacle to clear the area and reattempts navigation. If the obstacle is not cleared after 5 attempts, the AMR cancels the task and returns to the park position (a sketch of this retry policy follows the table). |
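The retry policy in the last row above can be expressed as in the following sketch. This is equivalent supervisory logic for illustration only; the navigation and alert interfaces are hypothetical, and the actual behaviour is handled by the AMR platform.

```python
import time

MAX_ATTEMPTS = 5

def navigate_with_retries(sesto, goal, horn):
    """Sketch of the obstacle-handling policy: warn, retry up to five
    times, then cancel and park. All interfaces are illustrative."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if sesto.navigate(goal):               # True when the goal is reached
            return True
        horn.sound()                           # prompt the obstacle to clear
        time.sleep(5)                          # wait before reattempting
    sesto.cancel_goals()
    sesto.navigate("park")                     # return to the park position
    return False
```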
6.2.3 Communication and control edge cases
This subsection describes faults related to data exchange and coordination between the various subsystems of the robot platform.
| Trigger condition | Affected task | Identified by | System response |
|---|---|---|---|
| The master action node fails to obtain a successful trajectory calculation. | Any movement sequence with the Cobot. | Feedback from the MoveIt action server. | The master action node informs the sequencer node, which then terminates any existing requests. A warning is then sounded for an operator to check the system. |
| The webserver node loses connection with the sequencer node. | Any tasks. | Feedback monitoring in the webserver node. | The webserver node notifies the user on the debug console that the sequencer node has timed out and that the robot may require maintenance. |
| The SESTO node receives an error from the SESTO's navigation API. | Any navigation goal requests to the SESTO node. | Feedback from the SESTO node. | The sequencer node will terminate the order and request an operator to check the robot. |
| The CAN bus node does not receive feedback for the CAN transmission. | During gripper selection. | CAN node feedback. | The sequencer node will terminate the order and request an operator to check the robot. |
7 User validation and conclusion
7.1 User validation
Following the completion of system development, user validation testing was carried out to evaluate how well the system performed when operated by intended users. The purpose of this testing was not only to verify that the system could complete its intended tasks, but also to assess its usability, clarity of operation, and overall user experience. This chapter describes the testing procedure, presents the results observed during the trials, and discusses the feedback obtained from users after interacting with the system.
Two stages of validation were carried out: a preliminary user validation and a primary user validation.
7.1.1 Preliminary user validation
Prior to the main user validation session, a preliminary user validation test was conducted with the primary stakeholder. The purpose of this initial evaluation was to obtain direct feedback on the prototype system before its deployment to the wider user group. As the stakeholder possessed a strong understanding of the intended application and operational requirements, this stage provided an opportunity to identify minor issues in workflow, interface clarity, and system behaviour that could be improved before the primary user validation was carried out.
During this preliminary validation seen in figure 122, the stakeholder interacted with the system and assessed its operation in relation to its intended use case. Feedback was collected through direct observation and discussion, focusing on areas such as usability, task flow, clarity of interaction, and practical suitability. The comments obtained were then used to introduce small refinements to the system to improve its readiness for broader user testing.
7.1.2 Primary user validation
Following the preliminary refinement stage, the system was subjected to primary user validation involving the intended user group. The objective of this stage was to evaluate the system under more realistic user-facing conditions and to assess its usability, effectiveness, and user acceptance after the initial refinements had been incorporated. Figures 123 and 124 show the primary user validation sessions, where users were observed interacting with the system and providing feedback on their experience.
During this round, key feedback from the students and stakeholders concerned the user experience, suitability for the intended environment and workflow, maintainability of the system, and general improvements and recommendations for future work. A summary of the feedback is given in the table below.
| User Experience | Integration of system into Hub’s workflow | Maintainability of the system | General improvements and recommendations |
|---|---|---|---|
| Users thought the controls were intuitive and straightforward. But the waiting time within the system and the lack of visual updates led some users to think that the robot may be stuck. | Overall reception was positive that the prototype showed high promise in being able to reduce/minimise redundant tasks for the staff. | The system’s maintainability was well received by the staff in terms of the hardware accessibility. However, software maintainability and error logging still need high development effort to reach the same ease. | Speed and efficiency for the robot’s motions were the main primary improvement for the stakeholders. Another key improvement is having the robot link up to an inventory management system in the Hub with proper end effectors for picking other tools and inventory in the storeroom. |
7.2 Future works and improvements to the system
For each subsystem, spanning both hardware and software, there remain several opportunities for further development to improve the overall efficiency, functionality, and usability of the system.
For the end effector and tool changer subsystems, a more focused design effort on gripper finger geometry may enable the development of a universal gripper capable of handling the full range of tools on the board. In addition, further improvements to the electronic hardware and embedded firmware could expand the level of control available to the Mini PC, thereby increasing the flexibility and functionality of gripper operation.
The primary limitation of the current computer vision pipeline is its inability to reliably grasp tools that are significantly slanted. This is because the object detection model produces only axis-aligned bounding boxes and has no knowledge of the object's orientation. Future work could explore alternative model architectures, such as oriented bounding boxes or keypoint estimation, to address this.
7.3 Conclusion and impact
To conclude, this project focused on the development of an integrated robotic system designed to assist Hub staff by reducing repetitive manual tasks and improving overall operational efficiency. Throughout the project, the team applied and synthesised knowledge from multiple disciplines, resulting in the successful development of a functional proof-of-concept prototype.
Although the current system remains a prototype and would require further refinement, optimisation, and targeted development before it is suitable for long-term real-world deployment, it has nevertheless demonstrated the feasibility and potential of such a solution. More importantly, the project highlights how industrial robotic systems can be adapted beyond conventional manufacturing settings to support workflow improvement and efficiency enhancement in other operational environments.
References
- Lapusan, C., et al. (2025). Optimal design of 3D-printed flexible fingers for robotic grippers. *Actuators*, 14(10), 468. https://doi.org/10.3390/act14100468
- Mitsubishi Electric Corporation. (2023). *CR800 series controller tracking function instruction manual* (BFP-A3520-E).
- Mitsubishi Electric Corporation. (2024). *CR800 series controller CR750/CR751 series controller Ethernet function instruction manual* (BFP-A3379-G).
- Mitsubishi Electric Corporation. (2024). *CR800-D series controller GOT direct connection extended function instruction manual* (BFP-A3546-G).
- Mitsubishi Electric Corporation. (2024). *Mitsubishi Electric industrial robot CR800-05VD controller instruction manual: Controller setup and maintenance* (BFP-A3731-F).
- Mitsubishi Electric Corporation. (2025). *RV-5AS-D standard specifications manual* (BFP-A3727-N).
- Mitsubishi Electric Corporation. (2025). *Mitsubishi Electric industrial robot RV-5AS instruction manual: Robot arm setup and maintenance* (BFP-A3729-F).
- Mitsubishi Electric Corporation. (2025). *CR800 series controller detailed explanations of functions and operations* (BFP-A3478-X).
- Mitsubishi Electric Corporation. (n.d.). *CR800 series controller troubleshooting* (Instruction manual).
- Niedziałek, P. W., Figurski, A., & Kowalik, M. P. (2023). Numerical and experimental studies on 3D printed compliant mechanisms in gripper applications. *Advances in Science and Technology Research Journal*, 17(6), 228–244. https://doi.org/10.12913/22998624/174109
- SESTO Robotics Pte Ltd. (2022). *SESTO Magnus Element software reference manual* (Version 1.6.0).
- SESTO Robotics Pte Ltd. (2022). *SESTO Magnus operator's manual* (Version 1.6).
Appendix A
This appendix is dedicated to the Cobot, Mitsubishi Electric RV5-AS, Specifications and Information.
A1
A2
A3
A4
A5
Appendix B
This appendix is dedicated to the AMR, Sesto Magnus Lite, Specifications and Information.
B1
B2
B3
B4
Appendix C
This appendix is dedicated to any additional data collection on the Hub's environment and space.
Appendix D
This appendix is dedicated to the Frame Subsystem.
D1
D2
D3
D4
D5
D6
D7
D8
Appendix E
This appendix is dedicated to the End Effector and Tool Changer Mechanical.
E1
E2
E3
E4
E5
E6
E7
E8
Appendix F
This appendix is dedicated to the End Effector and Tool Changer Electrical.
F1
F2
F3
F4
F5
F6
F7
F8
F9
F10
F11
F12
F13
F14
F15
F16
Appendix G
This appendix is dedicated to the Software Subsystem.
G1
G2
G3
G4
Appendix H
This appendix is dedicated to the Computer Vision section.
H1
H2
H3
H4
H5
H6