Over sixty years of research and development in designing artificial cardiac pacemakers has resulted in remarkable robustness, convenience, and acceptance of these devices. As the number of patients using pacemakers continues to rise, so are the expectations on the invisibility of its design resulting in ever-increasing demands on improved functionality, size shrinkage, power consumption, battery performance and adaptability to individual cardiac arrhythmias. The medical device industry continues to meet these challenges by extending its design with more sensors (e.g., for rate augmentation, sleep apnea, and haemodynamic status) resulting in ever-increasing complexity of the device software.While manual design, supported by exhaustive model-based development, has served well for previous generations of devices, the need for adaptability without sacrificing the basic functional correctness is ever-present. We propose a correct-by-construction reinforcement learning (RL) paradigm to design cardiac pacemakers by integrating formal requirements in RL. To this end, we capture the time-critical requirements for the dual-chamber pacemaker in an expressive real-time formal logic, duration calculus. Correct-by-construction synthesis is an approach to safety-critical system design that advocates the integration of the formal proof-of-correctness of the designed system by automatically refining formal specifications. Focusing on dual chamber pace makers, we propose to develop an efficient shielding-based RL for an expressive class of duration calculus specifications.