Supervisor: Dmytro Fishman
In this project, we explore unsupervised approaches to nuclei segmentation as an alternative to fully supervised models, which require large amounts of pixel-wise annotated data.
To measure the effect of a drug on a population of cells, one can follow how the population evolves in a Petri dish, tracking cell locations, counts, and shapes. Nuclei segmentation is one of the first steps in the analysis of such microscopy images. In the past, nuclei segmentation was done manually, which made it a tedious and time-consuming part of the pipeline. Later, classical computer vision methods were applied, but they relied on manual parameter adjustment for each experiment. In recent years, neural networks have made a massive contribution to the field of computer vision, and their application to medical imaging data is now a fast-growing field with promising applications. However, current methods rely heavily on large amounts of carefully annotated data, and creating annotated data sets for medical imaging is expensive and time-consuming for the specialists involved. Hence, exploring methods that could work without annotated data is an important direction in current research.
In the current project, we have focused on the fluorescence modality. The available data is partitioned as follows: the training and validation sets consist of 2016 and 504 greyscale images, respectively, each of resolution 1080 by 1080. To reduce the memory footprint of each training sample, we have split each image into 4 tiles (top left, top right, bottom left, and bottom right) of size 540 x 540 each. This gave us a training set of 8064 images and a validation set of 2016 images.
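The tiling step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `split_into_quadrants` and the all-zeros placeholder image are hypothetical and not taken from the project code.

```python
import numpy as np

def split_into_quadrants(image):
    """Split a 2-D image into four equal tiles: TL, TR, BL, BR."""
    height, width = image.shape
    if height % 2 or width % 2:
        raise ValueError("image sides must be even to split into equal tiles")
    half_h, half_w = height // 2, width // 2
    return [
        image[:half_h, :half_w],   # top left
        image[:half_h, half_w:],   # top right
        image[half_h:, :half_w],   # bottom left
        image[half_h:, half_w:],   # bottom right
    ]

# Placeholder for one 1080 x 1080 greyscale image (assumption: real data is loaded elsewhere).
image = np.zeros((1080, 1080), dtype=np.uint8)
tiles = split_into_quadrants(image)
```

Applied to every image, this turns 2016 training images into 8064 tiles and 504 validation images into 2016 tiles.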
As a model, we have chosen a U-Net¹, which is a widely used deep learning model in medical imaging.
We have implemented the U-Net and the data loading and preprocessing pipeline following a great tutorial by Aladdin Persson.
The training was performed on the Tartu University HPC using an NVIDIA Tesla V100 GPU. For most of the experiments, we have used the Adam optimizer with a learning rate of 0.0001 and a batch size of 8. The best results were obtained during the first 3 epochs; further training degraded the performance.
As a baseline for unsupervised semantic segmentation of fluorescence images, we have used Otsu’s method, a nonparametric and unsupervised method of automatic threshold selection for image segmentation.⁷
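For readers unfamiliar with Otsu’s method, the idea is to pick the intensity threshold that maximises the variance between the two resulting classes. The sketch below is a plain NumPy illustration of that idea, not the code used in the project; the function name, the bin count, and the synthetic test image are assumptions.

```python
import numpy as np

def otsu_threshold(image, n_bins=256):
    """Return the threshold that maximises the between-class variance."""
    hist, bin_edges = np.histogram(image.ravel(), bins=n_bins)
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    hist = hist.astype(np.float64)

    weight_bg = np.cumsum(hist)              # pixel count in bins 0..i
    weight_fg = np.cumsum(hist[::-1])[::-1]  # pixel count in bins i..end
    # Class means; the maximum() guard avoids division by zero for empty classes.
    mean_bg = np.cumsum(hist * bin_centers) / np.maximum(weight_bg, 1e-12)
    mean_fg = (np.cumsum((hist * bin_centers)[::-1])
               / np.maximum(weight_fg[::-1], 1e-12))[::-1]

    # Between-class variance for a split between bin i and bin i + 1.
    variance = weight_bg[:-1] * weight_fg[1:] * (mean_bg[:-1] - mean_fg[1:]) ** 2
    return bin_centers[np.argmax(variance)]

# Synthetic example: dark background with one bright square "nucleus".
image = np.full((64, 64), 10.0)
image[20:40, 20:40] = 200.0
threshold = otsu_threshold(image)
mask = image > threshold  # True where the pixel is classified as nucleus
```

No parameters have to be tuned per experiment, which is what makes the method a convenient unsupervised baseline.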
Methods and approaches
One can approach the task of unsupervised learning from different angles. In the current project, we have explored the power of the loss function.
Otsu as a ground truth
Our first approach was to use Otsu’s predictions as a ground truth. The idea was that the network would be able to generalize better than Otsu before it starts overfitting. Although the network outperformed the Otsu baseline on some epochs, the metrics were very unstable. Nevertheless, this approach inspired us to try the same idea with better predicted masks.
ACWE as a loss for CNN
A traditional way to solve the segmentation task in biomedical imaging is to define a meaningful energy functional and then derive a system of differential equations whose solution minimizes that functional. The system of differential equations is then discretized and solved with numerical algorithms.
One of the widely used methods in medical imaging segmentation is Active Contours Without Edges² (ACWE), introduced back in 2001.
In deep learning, the loss function plays a vital role, since the network extracts information based on what we define as an objective. In the unsupervised setting, we do not have a ground truth to compensate for the lack of a strong objective function. Hence, one approach is to take a fresh look at the traditional methods in the field and reformulate proven-to-work energy functionals as loss functions for a deep neural network.
One of the works³ that parametrizes the ACWE energy functional as a loss for a CNN was recently presented at the MIDL 2020 conference. There, the authors solve the task of unsupervised bone segmentation from synthetic CT-scan images. We have tried the corresponding loss function on our data set.
We have also used masks generated by a U-Net trained with the ACWE loss as ground truth for training a U-Net with binary cross-entropy (BCE), and it helped improve the performance.
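To give a feel for what an ACWE-style objective measures, here is a minimal NumPy sketch of a Chan-Vese type energy evaluated on a soft mask. It is an illustration under assumptions, not the loss from the cited paper or from our code: the weights `lambda_in`, `lambda_out`, and `mu` are hypothetical names, and the total variation of the mask is used as a stand-in for the contour length.

```python
import numpy as np

def acwe_energy(image, soft_mask, lambda_in=1.0, lambda_out=1.0, mu=1.0, eps=1e-8):
    """Chan-Vese style energy of a soft mask in [0, 1] over a greyscale image."""
    inside, outside = soft_mask, 1.0 - soft_mask
    # Mean intensities of the two regions, weighted by the soft memberships.
    c_in = (inside * image).sum() / (inside.sum() + eps)
    c_out = (outside * image).sum() / (outside.sum() + eps)
    # Region terms: each region should be close to its own mean intensity.
    region = (lambda_in * (inside * (image - c_in) ** 2).sum()
              + lambda_out * (outside * (image - c_out) ** 2).sum())
    # Length term (assumption): total variation of the mask as a proxy for contour length.
    grad_rows, grad_cols = np.gradient(soft_mask)
    length = np.sqrt(grad_rows ** 2 + grad_cols ** 2 + eps).sum()
    return region + mu * length

# Synthetic example: a mask aligned with the bright square versus a shifted one.
image = np.zeros((64, 64))
image[20:40, 20:40] = 1.0
good_mask = (image > 0.5).astype(float)
bad_mask = np.roll(good_mask, 10, axis=1)
good_energy = acwe_energy(image, good_mask)
bad_energy = acwe_energy(image, bad_mask)
```

The aligned mask yields the lower energy, so no annotation is needed to prefer it. In a training setup, the same expression would be written with differentiable tensor operations and the soft mask would be the network output.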
RFCM as a loss for CNN
We have also evaluated a recently proposed⁴ parameterization of Robust Fuzzy C-Means (RFCM) clustering⁶ as a loss function for a CNN.
The RFCM loss has 2 hyperparameters: the fuzzy factor q and the regularization weight beta. q = 1 corresponds to hard clustering, while larger values result in soft clustering. The regularization term penalizes “changes in the value of the membership functions in local neighborhoods.”⁴ We have experimented with different values of q and beta, and the best results were obtained for q = 1 and beta = 0.
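The role of q and beta can be illustrated with a small NumPy sketch of an RFCM-style energy. This is a hedged illustration rather than the exact loss from the cited papers: the 4-connected neighbourhood, the zero padding at the image border, and the absence of any constant factor in front of beta are assumptions, and the function name is hypothetical.

```python
import numpy as np

def rfcm_energy(image, memberships, q=2.0, beta=0.0):
    """Robust fuzzy C-means style energy for memberships of shape (K, H, W)."""
    u_q = memberships ** q
    # Class centroids: membership-weighted mean intensity of each cluster.
    centroids = (u_q * image).sum(axis=(1, 2)) / (u_q.sum(axis=(1, 2)) + 1e-8)
    # Data term: squared distance of every pixel to every centroid.
    distances = (image[None] - centroids[:, None, None]) ** 2
    data_term = (u_q * distances).sum()
    if beta == 0:
        return data_term
    # Sum of u^q over the 4-connected neighbours of each pixel (assumption: zero padding).
    padded = np.pad(u_q, ((0, 0), (1, 1), (1, 1)))
    neighbours = (padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1]
                  + padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:])
    # For class k, penalise neighbouring membership in every other class m != k.
    other_classes = neighbours.sum(axis=0, keepdims=True) - neighbours
    return data_term + beta * (u_q * other_classes).sum()

# Synthetic example with two clusters: nucleus and background.
image = np.zeros((64, 64))
image[20:40, 20:40] = 1.0
nuclei = (image > 0.5).astype(float)
memberships = np.stack([nuclei, 1.0 - nuclei])
energy = rfcm_energy(image, memberships, q=1.0, beta=0.0)
```

With beta = 0 only the clustering term remains, which is the setting that worked best in our experiments. A positive beta adds a cost wherever neighbouring pixels belong to different clusters.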
Similar to the ACWE loss experiments, we have also used masks generated by a U-Net trained with the RFCM loss as a ground truth. This time, however, we have added post-processing of the generated masks: we have applied the erosion⁵ procedure (with a 2 by 2 kernel) from the OpenCV library to reduce the false positive margin around the predicted nuclei.
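The effect of a 2 by 2 erosion can be shown with a pure NumPy stand-in. Note that this is not the OpenCV call itself (OpenCV exposes erosion as `cv2.erode`); the anchor position and border handling below are assumptions and may differ from OpenCV's defaults, and the function name is hypothetical.

```python
import numpy as np

def erode_2x2(mask):
    """Binary erosion with a 2 x 2 structuring element of ones."""
    mask = mask.astype(bool)
    # Pad the top and left edges with True so the image border itself is not eroded.
    padded = np.pad(mask, ((1, 0), (1, 0)), constant_values=True)
    # A pixel survives only if it and its upper, left, and upper-left neighbours are set.
    return padded[1:, 1:] & padded[:-1, 1:] & padded[1:, :-1] & padded[:-1, :-1]

# Synthetic example: a 20 x 20 predicted nucleus shrinks to 19 x 19.
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 20:40] = True
eroded = erode_2x2(mask)
```

Each predicted nucleus loses a one-pixel strip along two of its sides, which trims part of the margin where false positives tend to sit.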
We observe that we have outperformed the baseline method. Notably, we have obtained a significantly higher recall.
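For reference, recall and its companion precision can be computed from a predicted mask and a reference mask as sketched below. We assume a pixel-wise definition here; the function name and the synthetic masks are illustrative only and are not the evaluation code or data of the project.

```python
import numpy as np

def precision_recall(predicted, target):
    """Pixel-wise precision and recall of a binary mask against a reference mask."""
    predicted, target = predicted.astype(bool), target.astype(bool)
    true_pos = np.logical_and(predicted, target).sum()
    false_pos = np.logical_and(predicted, ~target).sum()
    false_neg = np.logical_and(~predicted, target).sum()
    # max(..., 1) avoids division by zero when a mask is empty.
    precision = true_pos / max(true_pos + false_pos, 1)
    recall = true_pos / max(true_pos + false_neg, 1)
    return float(precision), float(recall)

# Synthetic example: the prediction covers the nucleus plus a 2-pixel margin.
target = np.zeros((64, 64), dtype=bool)
target[20:40, 20:40] = True
predicted = np.zeros((64, 64), dtype=bool)
predicted[18:42, 18:42] = True
precision, recall = precision_recall(predicted, target)
```

In this toy case every nucleus pixel is found, so recall is perfect, while the extra margin lowers precision. This is the kind of trade-off that erosion of the predicted masks is meant to address.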
It was a journey of trial and error, but we have learned a lot from it. We can conclude that the application of unsupervised deep learning methods in medical imaging is a very promising direction for future research. We have verified that traditional methods can be reformulated as an objective function for unsupervised training of neural networks and perform quite well.
In our future efforts, we would like to test state-of-the-art unsupervised models on our data (for instance, IIC and MaskContrast). We would also like to explore the potential of unsupervised segmentation on the more challenging brightfield modality.
The project has been done as a part of the Neural Networks course at the University of Tartu.
¹ O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation. LNCS, vol. 9351, pp. 234–241, 2015. doi: 10.1007/978-3-319-24574-4_28.
² T. F. Chan and L. A. Vese. Active contours without edges. IEEE Transactions on Image Processing, vol. 10, no. 2, pp. 266–277, Feb. 2001. doi: 10.1109/83.902291.
³ J. Chen and E. C. Frey. Medical image segmentation via unsupervised convolutional neural network. In Medical Imaging with Deep Learning, 2020. https://2020.midl.io/papers/chen20.html
⁴ J. Chen, Y. Li, L. P. Luna, H. Chung, S. Rowe, Y. Du, L. Solnes, and E. Frey. Learning fuzzy clustering for SPECT/CT segmentation via convolutional neural networks. Medical Physics, 2021.
⁶ D. L. Pham. Spatial models for fuzzy clustering. Computer Vision and Image Understanding, vol. 84, pp. 285–297, 2001.
⁷ N. Otsu. A Threshold Selection Method from Gray-Level Histograms. IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, Jan. 1979. doi: 10.1109/TSMC.1979.4310076.